Foreword


The discovery in our time that scientific methodology can 
be applied to human problems has revolutionized psychology and 
has seriously affected all branches of social science. This discovery,
moreover, came during a period when problems of social adjustment
had reached a critical point in the years of depression, of
war, and of postwar crises. As a consequence, empirical and quanti- 
tative research in our field has seen unparalleled growth. This 
period of boom has naturally not been characterized by a high 
degree of order or of systematic development of theory and meth- 
odology. We have been too absorbed in doing research to plan 
thoroughly, to take stock of our progress, or to communicate our 
findings adequately and to inform one another of our techniques 
and approaches.

The first great break in this pattern came with the publication 
of Studies in Social Psychology in World War II. Stouffer and his 
collaborators took time out to set forth their findings and their 
methods in communicable form. These volumes were an excellent 
demonstration of the importance of the codification of research 
methods. In the early days of the social studies, there was justification
for scholars to give the result of their insights and reflections
without specifications concerning the ways in which they arrived
at their interpretations, for in that period they were working more
as intuitive artists than as scientists. But today, when we attempt
experimentation and quantification, we have no excuse for failing
to codify our procedures.

One essential aspect of scientific technique is that it can be 
stated in a standard form and can be taught so that trained and
competent investigators can apply it in the same fashion. We still
have not achieved the degree of specification possible in the physical
sciences. The method of the interview, for example, still combines
art and science. It is only, however, through making our procedures
explicit that we can test, criticize, and improve them.

Psychologists, social psychologists, and sociologists at the Uni- 
versity of Michigan have felt fortunate in the favorable atmosphere 
for social research at their institution which has made possible many 
projects in the academic departments and the creation of the 
Institute for Social Research with its coordinate divisions of the
Survey Research Center and the Research Center for Group Dy- 
namics. Since this research development brought together many 
specialists, it seemed worth while to take advantage of their physical 
and psychological proximity to produce a book on methodology. 
Two purposes were dominant: (1) to help in the present trend
toward codification of research techniques, and (2) to give graduate
students in the field some understanding of the principles and procedures
of modern methodology. The criterion for inclusion of methods
was the degree of relevance to the problems of social psychology,
and the criterion for exclusion was the availability of knowledge
about a technique already standardized in another field. Thus, although
factor analysis is a useful method in social science, the details
of its application have already been described in statistical texts.
Similarly, projective methods have been described in the personality
context in which they are characteristically used. On the other
hand, there has been a lack of detailed treatment of behavioral
observation, of the quantitative analysis of qualitative materials,
and of such major research settings as field studies and field experi- 
ments. 

There has been another underlying purpose in the publication 
of this book. It is our belief that progress in any field must rest 
upon methods appropriate to that field. Although the basic logic
of scientific methodology is the same in all fields, its specific techniques
and approaches will vary, depending upon the subject
matter. In its early stages, social psychology was handicapped by a
lack of methods appropriate to its problems. In general, the recruits
from the upper frontier of social science understood its larger 
problems but were unequipped as technicians to handle them. The 
technology came from the lower frontier of individual psychology, 
where there had been a long development in psychophysics, in 
laboratory methods, and in psychometrics. 

The attempt to apply this type of technology to social psychol- 
ogy was much too literal and failed to consider the appropriateness 
of the technique to the problem under investigation. Hence the 
earlier efforts to test Freudian concepts were fruitless. In industrial
psychology, precision measures of isolated motor performance were 
inadequate to cope with problems of fatigue and motivation. The 
item-reliability technique of the psychometrician was no answer 
in itself to the need for measures of cognitive and motivational 
structure in dealing with social attitudes. 

The real problem is not that techniques cannot be adapted to 
a variety of problems but that they tend to carry with them the 
type of thinking and even the concepts of the area in which they 
were developed. Thus, the experimental technique when first
applied to social psychology attempted manipulation of the amount
of social stimulation, i.e., the sheer physical dimension, as in the
"alone and together" experiments. The creation and manipulation
of the specific social influences came as a later development. Thus,
when old techniques are used in the social field they have to be
adapted to the conceptual framework in which they are applied.
Otherwise we shall find ourselves testing things other than the
theories in which we are interested. Moreover, the special problems
of our field call for new approaches and new techniques. The tradi- 
tional measurement procedures involved assumptions not necessarily 
met by social data. The development of new scaling methods and 
of nonparametric statistics are hopeful signs of progress in this
respect. 

Finally, the social researcher should consider his research design
from the point of view of testing the significant theories in his
own field rather than from the frame of reference of what he would
be doing if he were determining a sensory threshold. It is our conviction
that methodologies need to be written for the field of social
psychology itself. 

Most of the contributors to this volume are social psychologists. 
This means that the problems they discuss tend to be taken from
the field of social psychology. It is our belief, however, that there 
are many areas in the social sciences to which the approaches and 
methods described in the following pages will have application. 
For areas in the social sciences which deal with relationships be- 
tween group indices without reference to intervening variables, 
these methods may need the same sort of adaptation to meet the 
criterion for appropriateness as was demanded in the field of social 
psychology when it took over techniques from individual psychology. 

A cooperative undertaking of this sort requires not only the 
assistance of the contributors of the chapters which follow but the 
support of their colleagues. We are indebted to Donald Marquis, 
who participated in the planning of the project and who bears 
much of the responsibility for the circumstances which made the
book possible. Other participants in the project not formally represented
in the following chapters were Eugene Jacobson, Lowell
Kelly, Charles Metzner, Ian Ross, and Guy Swanson. In most books
there is generally one person who carries the brunt of editorial
work, and it has been our good fortune to have had Mrs. Emily
Willerman for this role, which she has carried out with unusual
devotion and competence.

University of Michigan 
June 21, 1953 


L. F. 
D. K.
INTRODUCTION


The Interdependence of 
Social-Psychological Theory 
and Methods: A Brief Overview


Theodore M. Newcomb 


It is a truism that no research results are any better than 
the methods by which they are obtained. Behind the platitude,
however, lie many complexities. Between the initial sensing of a 
problem and the final application of research results to that prob- 
lem, there lies many a choice, as the reader of this volume will 
discover. At each dividing of the path, moreover, there are diverse 
criteria for deciding what is "better." Just as Molière's M. Jourdain
was astonished on discovering that he had been speaking prose all 
his life, so not a few experienced researchers in social psychology will 
be amazed, on completing the baker’s dozen of chapters that follow, 
to learn how many decisions they have been making these many 
years— with or without knowing it. It is one of the objectives of this 
volume to create a more general awareness of the existence of the 
choice points and of the criteria by which decisions may be made. 

No article of faith in the scientist’s credo is more elementary
than his empiricist conviction that, if he learns how to ask the
proper questions of "nature," he can formulate the principles according
to which "nature" behaves. If our questions are not properly
put (i.e., if our observations are not suitably made) in the first place,
no amount of interpretative ingenuity at a later stage will enable
us to reach our research objectives. Such methodological problems
as devising interview schedules, selecting a sample of persons or of
written words, manipulating a variable in the laboratory, or constructing
an objective test are all problems of ensuring that the
questions which we put to "nature" will be maximally suitable to
elicit the answers required by our objectives. Problems of scaling,
categorizing, discovering covariation, and testing the significance
of differences are not merely matters of "translating" data already
obtained; they are basic in the sense of determining, whether we
know it or not, the kinds of questions which we are putting. Whatever
truth or falsity inheres in our research findings is quite as
much a function of the questions we have elected to ask through
our selection of methods as of the logic we have applied to the data
elicited by our questions.

The kinds of research problems described in the following 
chapters have been attacked within a very limited time-space setting. 
The methodological weapons which have been devised have, like 
military weapons at a given time and place in history, been con- 
ditioned by their setting. Some aspects of this setting impinge alike 
upon every variety of research into human behavior, some have had
a special impact upon social research, and still others have influenced
in specific ways the somewhat dimly demarcated field of social-psychological
research. This brief introductory chapter attempts to
point to some of the contemporary methodological problems for 
social research in general and for social psychology in particular, 
and to note the position of the social psychologist in the confra- 
ternity of social researchers, as he borrows from and lends to his 
fellow members in the common enterprise. 


SOME COMMON PROBLEMS IN 
SOCIAL RESEARCH 

Social scientists face certain human problems which the natural
scientist is spared. As we shall note particularly in Part I, these
our research mechanisms in action. But phenotypic phenomena are
not necessarily the most significant ones to observe, even as dependent
variables, and as intervening or independent variables they
have, it seems to this writer, little better than chance likelihood of
being the most significant ones. In clinical psychology, for example,
a Rorschach W may be a more significant variable than the more
humanly phrased “social expansiveness.” Just so, in social science
we probably have more to gain by identifying genotypic X’s (with
or without human-sounding labels) than by seeking to refine our
measures of readily observable, human phenotypes. Helen Peak, in
Chapter 6, has examined some of the properties of significant variables
in terms of “functional unity.”

Another problem of which social scientists of nearly every stripe
are becoming increasingly aware has to do with the decisions they
make when they employ a given process of measurement. In Chapter 5,
Leslie Kish points to some of the consequences of using one
device rather than another at various stages of sampling procedures.
Keith Smith, in Chapter 12, points out some of the assumptions
involved in the use of what may be our favorite statistical procedures
and suggests alternatives which many of us will find more
appropriate, once we are aware of the nature of the statistical decisions
we have been making.

Clyde H. Coombs, in Chapter 11, goes to the very roots of the
question “What is the nature of measurement itself?” Since we are
necessarily doing something when we transform “real” events into
numbers, whether at the stage of making observations or at the
stage of analysis, it behooves us to know what we are doing. The
requirements of this transformation process, together with certain
properties of the events which the social scientist studies, confront
him with a special dilemma. This chapter is characteristic of the
tone of the entire book: instead of presenting recipes to be followed,
it seeks to understand the logic of a type of problem which social
scientists frequently meet. Since for the social scientist every investigation
situation includes a large component of uniqueness and since
(as Dorwin Cartwright notes in Chapter 10, for example) the decisions
made at every step of the investigation process are dependent
upon decisions made at other steps, the investigator himself must
construct his own blueprint.

SPECIAL PROBLEMS IN SOCIAL-PSYCHOLOGICAL
RESEARCH


In addition to the hazards which beset all kinds of social
research, social psychology is subject to some special disabilities
sui generis. Many of these stem from its interdisciplinary parentage
and its still ambiguous boundaries. From the point of view of range
and richness, social psychology has gained much by its generous
borrowings from individual psychology, sociology, cultural anthropology,
and psychiatry. But the cost has been heavy. We draw upon
a wide and only loosely integrated variety of concepts. Our sources
of data are as diverse as the analytic couch, the laboratory, the
playground, the factory, the community, and the random sample
of adults in a total society. It is likely, indeed, that many social
psychologists are not even aware that they have at their disposal
so broad a range of materials as those noted by Robert C. Angell
and Ronald Freedman in Chapter 7. It is not surprising, therefore,
that we have recourse to a wide range of methods in making observations,
in isolating, measuring, and controlling our variables, and
in analyzing our data. Under such conditions the difficulties of
discriminating among settings and methods of apparently equal
relevance and serviceability increase exponentially with the range
of alternatives. Social-psychological research seems, at midcentury,
to be peculiarly subject to these difficulties.

There is, of course, nothing intrinsically undesirable in having
recourse to a wide range of settings and methods; diversity, like
adversity, has many uses. Several of the following chapters note
possible ways (or even necessary ones) of taking advantage of this
situation. Thus it is pointed out in one or more of the chapters in
Part I that field studies have called attention to an increasingly wide
range of variables which are manipulable or controllable in the
laboratory, that field studies and surveys can supplement each other
in significant ways, that certain kinds of problems are furthered by
a planned sequence of quite different methods, and that, finally,
genuinely programmatic research will very often necessitate such
planned sequences. In somewhat similar vein Leon Festinger, in
Chapter 4, discusses the special problem of “artificiality” in the
laboratory as contrasted with the “realness” of other situations.

It may happen, of course, that a specific investigator will specialize
in a given method sheerly because of the conveniences or
the necessities of division of labor. But this is not the same thing 
as specialization of method for a given problem. Hunches emerging 
from one sort of methodological study will often need to be tested 
by other methods, and results confidently confirmed by one method 
may turn out to be so loaded with the situational factors necessarily 
associated with the use of that method as not to be confirmable at 
all when other method-situation complexes are used. The more 
“basic” the problem, in general, the more essential it is that it
be investigated programmatically by a designedly wide range of 
methods. 

Another, and doubtless closely related, handicap under which 
social psychology currently suffers is that of too rapid growth. If its 
development has not been exactly hypertrophic, the tendency has 
certainly been to incorporate more than has been digested. John 
R. P. French, Jr., observes in Chapter 3, for example, the relatively 
low levels of abstraction at which social-psychological propositions 
are of necessity frequently formulated. And Daniel Katz notes, in 
Chapter 2, the relative rarity with which social-psychological inves- 
tigations have been replicated. To an unknown extent both of these 
conditions are attributable to considerations of methodology. When 
we shall have achieved more satisfactory and more “standardized” 
methods of investigation, with clearer criteria concerning the range 
of appropriateness of various methods and tools, we shall, of course, 
be in a much better position to make genuinely comparable studies, 
which are a necessary precondition to high-level abstractions. And,
similarly, a more satisfactory armamentarium of tools and methods
will facilitate the replication of significant studies.

One’s assessment of the consequences of these and other present 
shortcomings of social psychology will depend upon the nature of 
one’s hopes and expectations and upon one’s definition of the field. 
If one regards social psychology as an applied field, one will be
dissatisfied with the present state of affairs primarily on grounds 
that the applicability of a given principle to a given situation is 
highly uncertain. But if one looks at social psychology as itself 
constituting a body of theory, the nature of one’s concern will be 
quite different. Furthermore, the sources to which one looks for the 
improvement of methods will vary considerably, according as one’s
definition of the field stresses its applied or its theoretical aspects.
This brings us to the reciprocal of our initial truism: No method
is any better than the theory by which it is tested. This point,
touched upon in several chapters, is most explicitly made by Helen
Peak, in Chapter 6, and Roger W. Heyns and Alvin F. Zander, in
Chapter 9. To the extent that our objectives center about applicability,
we shall be content with empirical tests of their adequacy.
This way, however, does not lead in the direction of generalizability
and high levels of abstraction. Particularly in a field characterized
by diverse kinds of settings and multiple divergent situational
determinants, this way leads, at best, to a cookbook-like compendium
of directions. Even from the point of view of applicability it is a
discouraging way, since the possible concatenations of situational
variables are almost infinite.


Conversely, to the extent that we are moving toward objectives
of high-level generalizability, we shall look to theoretical tests of our
methods as the crucial ones, i.e., crucial in ways such as those suggested
by Helen Peak. But since this book is devoted to social-psychological
methods, the question is thus raised concerning the
existence or even the possibility of a genuinely social-psychological
theory.

To such a question there would probably be no unanimous
reply from the several authors of this volume, much less from the
somewhat miscellaneous body of their colleagues who refer to themselves
as social psychologists. But, among the present authors at
least, the differences spring from nothing more consequential than
differences in the use of labels and in notions of what are proper
boundary lines among disciplines. All would agree that the processes
by which persons relate themselves to one another and simultaneously
to other aspects of their environment occur in orderly fashion
and are subject to scientific investigation. This writer likes to
apply the label “social psychology” to this area of investigation.
He may differ with some among his colleagues as to the degree to
which the principles accounting for these orderly processes represent
applications of principles borrowed from neighboring disciplines.
If so, there are differences among us as to the independent status
of social psychology but not as to the importance of establishing
the principles at the highest possible level of abstraction.


SOCIAL-PSYCHOLOGICAL AS RELATED TO
OTHER RESEARCH METHODS 

Suppose it be agreed that there exists an area of theoretical 
importance which does not lie entirely within the confines of either 
sociology or psychology. Let it be accorded a quasi-independent 
status and labeled, for the time being, "social psychology." And 
suppose it be also agreed that there exists a body of methods char- 
acteristically used by social psychologists, not all of which are used 
by either sociologists or psychologists. Does it follow that there must 
also exist a quasi-independent body of social-psychological methods? 
It might follow from the demands of an exaggerated professional 
pride but not from those of logic. Such distinguishability as social 
psychology has from its neighboring disciplines inheres in whatever 
distinctiveness characterizes the problems to which it addresses itself; 
the methods by which it attacks them might or might not be distinc- 
tive. In actual practice, there is little distinctiveness of method. 

It is evident, even from a quick scanning of the Table of Con- 
tents, that the methodological problems treated in this book are by 
no means the exclusive or necessarily the primary concern of social 
psychologists. If there are basic methodological problems in the 
planning of laboratory experiments, for example, social psychologists 
have not been the first nor will they be the last to be faced by them. 
Social psychologists, being relative latecomers on the scene of 
research in human behavior, have been pioneers in few of the 
areas here represented, and in some instances their contributions 
have been relatively minor ones. Strictly speaking, there are prob- 
ably no social-psychological methods as such. The opening chapter, 
by Angus Campbell and George Katona, for example, emphasizes 
the interdisciplinary nature of survey methods. 

Nevertheless, the total contribution of social psychologists to 
social-research methodology has been considerable. Precisely because 
their methodological problems have not been entirely distinctive, 
they have been in a position to lend. What they have had to lend 
has not been of their sole creation. They have had to borrow, but 
they have also had to adapt what they have borrowed, and in adapt- 
ing they have invented. It is their marginal status which has made 
this both necessary and possible. Having one foot in pioneer territory,
they felt the necessity to invent; having the other foot in territory
at least partly explored, they had some awareness of methods
available to be borrowed and adapted.

Nowhere is this better illustrated than in the still unfinished
problem of quantifying qualitative data. Sociologists and psychologists
had long since developed their indices of discrete, objectively
countable events and (borrowing from biologists) their statistical
devices for analyzing their data. But many of the data crucial for
the social psychologist’s purposes were not discrete and countable.
Being unwilling to give up the advantages of quantitative control
and at the same time aware of the unsuitability of standard
parametric statistics, the social psychologist was forced to reconsider
the whole research process. This meant keeping in mind from first
steps to last the interdependence of data-gathering methods and
the statistical methods by which they were to be analyzed. Awareness
of such problems has resulted not only in the complete recasting
of the methodology of attitude research but also in the development
of new theories of scaling and even of measurement itself.

The marginality of the social psychologist’s concerns may force
him to become methodologically inventive, but it is no guarantee
of the adequacy of his inventions. There must be selectivity as well
as inventiveness, and selectivity presupposes a keen sense of the
appropriateness of methods to problems. But criteria of appropriateness
may conflict with other criteria. For example, most social
psychologists have been fairly well schooled in the axioms of rigorous
objectivity. If we cannot measure something we are tempted, as
Clyde H. Coombs points out in Chapter 11, to go ahead and measure
it anyhow. Such methodological compulsiveness defeats our basic
objective of remaining maximally faithful to the events which we
observe. The proper solution to this dilemma of rigor vs. faithfulness
lies not in abandoning either objective but in reassessing the
means by which rigor is attainable, given that a certain sort of event
is to be investigated. True rigor lies not in slavishly borrowing the
“standards” developed for other purposes but in combining maximum
faithfulness to empirical events with maximum reproducibility
of procedures. Specific standards of rigor are relative, and not transferable.

If social psychology has led to methodological inventiveness
it has not been because of any jurisdictional label which its
practitioners have worn. Rather, sensitivity to characteristically 
social-psychological problems has mothered invention because it 
presupposes the kind of cross-disciplinary familiarity which fosters 
discrimination among different sorts of data and among different 
sorts of methods. This, in turn, enhances the sense of appropriate- 
ness of methods to data and of both to basic theoretical problems. 


SIGNPOSTS TO METHODOLOGICAL 
DEVELOPMENT 

If it is true that research results are no better than the methods 
by which they are obtained, it is also true that the maturity of a 
discipline can be gauged by the spread of methodological sophistica- 
tion among its practitioners. By this test, social-psychological re- 
search is perhaps not very far developed, but its rate of development 
seems nonetheless encouraging. The publication of three works 
within as many years (the present one, the 1951 volumes entitled
Research Methods in Social Relations (1), and a forthcoming Handbook
in Social Psychology (3), each devoted in whole or in part to
methods of social-psychological research) is both a sign of a felt need
for such sophistication and a probable cause of its spread. One of 
the necessary conditions for the growth of science is the replicability 
of its findings; this depends upon standardization and codification 
of its methods, which, in turn, depend upon communication through 
works such as this one.

Viewed in developmental perspective, the present position of 
social-psychological research thus seems a reasonably promising 
one. Among other factors which will determine its position in years
to come, one is of particular importance: methodological research. 
At more than one point in this book the contributor has been forced 
to rely upon lore, i.e., common sense and the cumulative “wisdom”
of experienced researchers. This, too, should be communicated, but
it is no substitute for controlled research procedures. The “wisdom,”
moreover, may turn out to be very unwise indeed. There has been 
considerable difference of opinion, for example, as to the advantages 
sometimes claimed for the face-to-face interview. But the necessity 
for basing our procedures upon mere opinions begins to disappear 
when it is shown that certain specific differences, as predicted by a
theory that barriers to communication vary with role relationships,
distinguish interview responses and responses to printed questionnaires
by the same persons (2).

The authors of this volume have been able to draw upon a 
limited pool of methodological research findings, to which they have 
themselves contributed. As social psychologists apply these research
procedures to their own methodological problems, this pool will
continue to expand.

Research Settings


Empirical research in social psychology, sociology, and re-
lated areas proceeds in a variety of settings and contexts. The choice
of setting for any research project is generally guided by the nature
of the questions being asked and the degree of control desired.

The variety of settings with which we must, then, concern 
ourselves may be ranged along a continuum from broad to narrow. 
We have chosen, for the sake of convenience, to divide this range
arbitrarily into four sections. The boundaries between any two
adjacent sections are never perfectly clear and sharp, but the division
seems to have some validity in that different considerations and
different techniques become important as we move from one to the
other.

The broadest setting is, of course, one in which a large and 
perhaps spread-out population is designated for investigation. Here
survey methods are usually employed. Intensive study of field situa-
tions usually requires a narrower setting, such as a certain com-
munity or organization or industrial plant. When experiments are
to be done in real-life settings, it is usually necessary to narrow the
focus even more. Laboratory experiments are usually done with
small numbers of people in very narrow settings.

It is no accident that as we move from broad to narrow settings
we increase the amount of control that is possible. The narrower
the setting becomes, the more likely it is that the investigator will
be able to hold certain things constant, vary others at will, and
achieve greater precision of measurement. The broader settings,
however, contribute heavily to our knowledge of the patterning
of variables and of their significance in life situations. There should
be a continuing interaction from field to laboratory settings and
from laboratory to field.



CHAPTER ONE 


The Sample Survey: 

A Technique for Social- 
Science Research 

A. Angus Campbell and George Katona 


Many research problems require the systematic collection
of data from populations or samples of population through the use
of personal interviews or other data-gathering devices. These studies
are usually called surveys, especially when they are concerned with
large or widely dispersed groups of people. When they deal with only
a fraction of a total population (or universe), a fraction representa-
tive of the total, they are called sample surveys.

The basic survey procedure, as used in the social sciences, is
made up of a combination of techniques which have been developed
in various research disciplines. The procedures of interviewing, for
example, are based largely on the experience of psychologists,
anthropologists, and others who used the personal interview both
as a research tool and as a means of diagnosis or therapy long before
it was adapted for survey use. Techniques of scaling and other
methods of measurement have been borrowed from both sociology
and psychology. Sampling methods have come in part from agricul-
tural economics. Methods of content analysis have been drawn from
a variety of fields, including political science. Techniques of statisti-
cal analysis of mass data are common to all fields of quantitative
research in the social sciences.

The survey instrument is not the specific method of any one
social science discipline, and it is broadly applicable to problems in
many fields. It is this capacity for wide application and broad cover-
age which gives the survey technique its great usefulness in the be-
havioral sciences.


Surveys depend on direct contact with those persons, or a sam-
ple of those persons, whose characteristics, behaviors, or attitudes are
relevant for a specific investigation. Thus the survey method differs
from research carried out in libraries or archives by studying, re-
grouping, and analyzing records compiled for other purposes.

The survey technique is used only when the desired information
cannot be obtained more easily and less expensively from other
sources. It would be very inefficient, for example, to conduct a sam-
ple survey to determine the number of passenger cars in use in the
United States. Since American automobile owners must register their
cars every year, a study of the files of the state licensing bureaus can
provide such information much more rapidly and reliably. But no
information is available from any records on the occupations, inten-
tions, habits, or other characteristics of automobile owners. Similarly,
annual data are compiled, mostly from records kept for practical
and legal purposes, about the total national income or the national
birth rate. Information about the income or birth rate of such sub-
groups of the population as inhabitants of large cities, skilled work-
ers, or college graduates can be computed for certain years from the
Decennial Census (which represents, in a sense, an application of
the survey technique). But sample surveys are necessary to find the
answers to such questions as “How many and what kind of American
families had an income of less than $1000 in each of several succes-
sive years?” or “Are people with high incomes more optimistic about
the future than people with low incomes?” or “What are the dif-
ferences in the birth rate of people of different religious affiliations?”

Sample surveys can be undertaken only if the people selected as 
respondents are able and willing to give the desired information. It
would not be very rewarding to ask a sample of the population
to report their basal metabolic rate or to tell the interviewer whether
they have repressed hostility feelings toward their father. Such
information requires specialized techniques of determination, and
few people would be able to report these facts with any degree of
validity. Furthermore, it cannot always be assumed that people will
be willing to give information which they may reasonably be ex-
pected to have. Any information which might embarrass or incrim-
inate the respondent is likely to be suppressed or distorted. Never-
theless, the tolerance of respondents for inquiries into their personal
affairs is surprisingly high provided the interview is conducted
skillfully and tactfully. This willingness of most respondents to give
detailed information about themselves cannot be taken to mean,
however, that the people who are asked questions in a survey will
give answers with the impersonality of an electric tabulator. It is
often easier to specify the data which a survey hopes to assemble
than it is to devise and carry out the interview through which the
data can be successfully obtained.

Surveys vary greatly in their scope, their design, and their con- 
tent. As in any other research, the specific characteristics of any 
survey will be determined by its basic objectives. The statement 
of the essential questions which the research is intended to investi- 
gate delineates in large part the universe to be studied, the size 
and nature of the sample, the type of interviewing to be used, the 
content of the questionnaire, the character of the coding, and the 
nature of the analysis. Specific survey methods vary according to 
specific survey objectives. 


TYPES OF UNIVERSES SURVEYED 

The most widely known sample survey conducted in this coun-
try is the poll conducted by the Gallup organization. This journal-
istic enterprise has been asking questions of the American public
since 1936, and its reports are a familiar weekly feature of many
newspapers. Usually the Gallup Poll is intended to represent the
total adult population; in its pre-election straw votes it attempts to
represent the population of eligible voters.

Although the Gallup Poll has been the most highly publicized
survey of the national population, numerous other nation-wide sur-
veys are conducted each year. Perhaps the most ambitious of these
is the Current Population Survey, carried out by the Bureau of the
Census. This survey of some 25,000 households is conducted monthly
and is the source of much important information on housing and
employment (19).

Another well-known national survey is the Survey of Consumer
Finances, which has been conducted annually since 1946 for the Fed-
eral Reserve Board by the Survey Research Center (23, 43, 45).
This survey, based on a sample of some 3500 households, provides
data each year for such basic economic statistics as the distributions
of national income, savings, liquid assets, and purchases of homes
and durable goods, as well as on economic attitudes and motivations.

In England, the national population is surveyed frequently by
the Social Survey, a governmental agency whose function is to col-
lect information through sample surveys for interested departments
of the British Government (35). One of the major studies made by
this office is the Survey of Sickness, which has been conducted each
month since 1947 to determine the incidence and distribution of
sickness, incapacitation, and use of medical services (39).

A number of national studies are made in the United States
each year by the numerous commercial polling organizations. These
surveys are usually carried out under contract with private business
organizations and are customarily confidential. Since the methods
and results of most of these studies are not made public, they have
been of relatively little value to social scientists.

Many surveys take the national population as their universe,
because the nation is a basic political and economic unit. This is
especially likely to be true if the surveys are done for governmental
agencies whose policies affect the entire nation or for business organ-
izations whose markets are national in scope. Geographical areas
within the nation are also studied when the objectives of the survey
require special regional analysis or the representation of specific
states, counties, or cities. For example, the state of Washington sup-
ports the Washington Opinion Research Laboratory, which, as a
branch of the two universities of the state, conducts surveys of the
population of Washington on questions of public interest (15).

Numerous studies of individual cities have been made, usually
in connection with specific local problems. In 1951, the University
of Michigan undertook a continuing study of the Detroit metro-
politan area based on annual sample surveys. This project serves
as the vehicle for a variety of social science investigations.

Although geographically defined populations of these kinds are
perhaps the most common basis for sample surveys, there are many
other universes which may be studied. People of a certain occupa-
tion, for example, may be the subject of study. The Department
of Agriculture conducts a number of surveys each year in which it
is interested only in farms (38). Often it may define its universe
as cotton farmers or soybean farmers, perhaps also restricting the
area covered, so that the universe may include only so narrowly
defined a population as farmers in the state of Illinois operating
farms of more than thirty acres with soybeans as a main crop.

Housewives are a frequently surveyed population, usually in
relation to such information as preferences for packaged foods, meth-
ods and extent of canning and preserving of foods, incidence and
size of home gardens, and the like. Some household information,
particularly financial data, cannot be obtained satisfactorily from
housewives, and in such cases the universe would be defined as heads
of households.

Other occupational classifications have served as the basis for
occasional surveys in which the objectives required information from
such specialized groups. During World War II, for example, there
was a large-scale program of survey research in the United States
Army, intended to answer a wide variety of practical questions rang-
ing from soldiers' preferences for different articles of clothing to
their opinions on the determination of points for their ultimate
discharge (42). During the same period there were a number of
studies of shipyard workers, steel workers, and war plant workers
of various kinds, undertaken to help solve the pressing problems of
housing, transportation, absenteeism, and morale which existed in
production centers during that period (26).

Many surveys require samples of populations which are distin-
guished by some common behavior or experience. There have been
surveys of veterans, of college graduates, of subscribers to certain
magazines, of visitors to state parks, of people who ride trains, and
of many other equally specialized kinds of people. Such samples
are selected because these people have especial significance in rela-
tion to the objectives of the study. For example, a Survey Research
Center study designed to assess the factors which influence the
purchase of homes drew its sample from a universe of recent home
purchasers, people who had only shortly before gone through the
experience of selecting a home (47).

Surveys may also base their samples on purely demographic
characteristics. A study of Negro reaction to discriminatory practices
obviously requires a sample drawn from a universe of Negroes.
An analysis of the market for cosmetics is likely to get its information
from women rather than men. A survey of the problems attendant
on retirement will restrict its sample to people of retirement age. A
study of acculturation might require a sampling based on national
origin. Or a survey might very well take into account a number of
other factors and restrict its universe, for example, to white men
between the ages of twenty and thirty whose families have been in
this country for at least two generations. The determination of the
character of the universe depends largely on the objectives of the
study.

Sometimes the objectives of a survey call for a universe which
cannot be sampled until a prior screening survey has been made.
To find a sample of people of recent German descent, for example,
it would probably be necessary to interview a relatively large sample
of the general population, determining the national origin of each
respondent. From this total group, those respondents who identified
themselves as of recent German origin could be drawn out as a
subsample meeting the requirements of the original objectives. This
procedure is usually called “double sampling” and is useful when-
ever the universe in question has some specific characteristic which
is not closely associated with particular localities or is not otherwise
identifiable.

In 1948 the Survey Research Center found it desirable to inter-
view a sample of that part of the national population which is rela-
tively well informed regarding international affairs. This was accom-
plished by drawing, from the samples of three previous surveys
made by the Center in the area of international affairs, those respond-
ents who had shown themselves well informed regarding these issues
(44). This sample within a sample was reinterviewed. The appli-
cation of the technique of double sampling provided a representa-
tion of the well-informed section of the population.
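
The two-phase logic of double sampling can be sketched in a few lines of Python. The sketch is illustrative only: the population, the `informed` flag, and the function name `double_sample` are hypothetical, and simple random selection stands in for whatever first-phase design an actual survey would use.

```python
import random

def double_sample(population, first_phase_size, qualifies, seed=None):
    """Phase 1: draw a screening sample. Phase 2: keep those who qualify."""
    rng = random.Random(seed)
    screening_sample = rng.sample(population, first_phase_size)
    subsample = [person for person in screening_sample if qualifies(person)]
    return screening_sample, subsample

# Hypothetical universe in which every tenth person is "well informed."
population = [{"id": i, "informed": i % 10 == 0} for i in range(10_000)]

screening, informed = double_sample(
    population,
    first_phase_size=1_000,
    qualifies=lambda person: person["informed"],
    seed=7,
)
print(len(screening), len(informed))
```

The subsample is necessarily much smaller than the screening sample, which is why the first phase must be relatively large when the characteristic sought is rare.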

Survey methods can be applied to the study of a small, highly
selected universe as well as to broad segments of the population
such as those mentioned in the preceding paragraphs. There are
many research problems in social science for which only compara-
tively few individuals in the total population are relevant. The study
of political leadership, for example, would need to concentrate on
those infrequent individuals who fulfill the definition of leader.
Research into the characteristics of legislative action would have
to concern itself primarily with people serving in the role of legis-
lator. An investigation of the behavior of business corporations
would require information from those individuals responsible for
decisions governing the actions of such corporations. For such studies
a cross-section survey of the general public would be entirely inap-
propriate.

A Survey Research Center study of factors affecting industrial 
mobility provides an example of a specialized survey for which only 
relatively few individuals could serve as respondents (25, 46). A 
major objective of this study was to assess the influences which impel 
industrial executives to locate their plants at one or another of the 
various sites available to them. The sample was selected randomly 
from a list of all the firms located in a certain geographical area 
(the state of Michigan). In each of the two hundred firms compris-
ing the sample, a major executive was interviewed regarding the
advantages and disadvantages of the location of his plant. Decisions
on relocating plants or selecting sites for expansion are ordinarily
made by a few high-ranking individuals, and they are the only
qualified respondents for a study having these objectives.

Although the most impressive uses of the survey technique 
during the past ten years have been in the sampling of large, hetero-
geneous populations, it seems probable that there will be increasing
application of this research method to populations of a more re-
stricted character. As social science develops conceptual schemes,
based on observations and leading to new observations, it is likely
that the survey method (as well as all other available research 
methods) will be applied more sharply to the study of people of 
specialized characteristics subject to specifically defined circum- 
stances. 


TYPES OF SURVEY DESIGN 

Whenever survey data are to be gathered, there must be a deci-
sion as to the specific pattern or design which the data-collecting
will follow. Every scientist attempts to arrange the conditions of his
research so that the data which are forthcoming will bear most
effectively on the hypotheses he is attempting to test or the informa-
tion needs he is trying to fulfill. This is as true of the survey re-
searcher as it is of any other.


The Unweighted Cross Section 

The most familiar and the simplest survey design is the single-
time unweighted cross section. This is the method par excellence
for the determination of the characteristics of a population at a
specific point in time. For example, the systematic selection of every
nth card from the register of the undergraduates of a college
would provide the basis for a description of that body as to age,
sex, high school record, college entrance test scores, college grades,
or any of the other items of information which appear on the sam-
pled cards. Mean scores and distributions could be obtained for
each of these characteristics.
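
The selection of every nth card can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the register is invented data, the field names are hypothetical, and the random starting point within the first n cards is a common refinement assumed here rather than something stated in the text.

```python
import random
from statistics import mean

def systematic_sample(cards, n, start=None, seed=None):
    """Return every n-th card, beginning at a start position within the first n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if start is None:
        # Assumption: choose the starting card at random among the first n.
        start = random.Random(seed).randrange(n)
    return cards[start::n]

# Hypothetical register of 1,000 undergraduate cards.
rng = random.Random(0)
register = [
    {"id": i, "age": rng.randint(17, 23), "grade": round(rng.uniform(1.0, 4.0), 2)}
    for i in range(1000)
]

sample = systematic_sample(register, n=20, seed=1)
print(len(sample))                       # 50 cards out of 1,000
print(mean(card["age"] for card in sample))
```

Means and distributions for any other item on the cards would be computed from `sample` in the same way.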

The sample data would, of course, also make possible the cross-
analysis of all of these items, so that comparisons of grade averages
of students of high and low high school records could be made,
correlations between grades and entrance test scores for each college
year could be compared, and many other statistical analyses could
be carried out.

The principal objective of such correlational analysis is to
identify causation through the technique of inference. An example
may be found in Cartwright's studies of the relation of personal
solicitation of prospective buyers of government bonds during World
War II to the actual purchase of these bonds (12). Comparisons of
people who had been personally asked to buy bonds with those who
had not showed approximately twice as many bond buyers in the
first group as in the second. This discrepancy was not reduced when
such factors as age, income, education, and place of residence were
held constant. The inference is strong that individual purchases of
bonds were significantly affected by personal solicitation.


The Weighted Cross Section

A variation of the basic cross-section survey design is the
weighted cross section. This involves the deliberate oversampling of
some subgroup of the designated universe which has especial impor-
tance for the objectives of the survey but is known to be a relatively
small fraction of the total population. Thus, in designing a national
survey for the study of the uses made of public libraries, the Survey
Research Center unbalanced its sample design to bring into the
sample a larger number of recent users of libraries than would have
been caught in an unweighted cross section (10). This was done
by doubling the sampling rate in city blocks having high rentals,
on the assumption that people who pay high rents are more likely to
visit public libraries. The resulting increase in the number of library
users interviewed made it possible to describe the characteristics
and habits of this important group with greater confidence than
would have been possible otherwise.

Oversampling is especially useful in surveys dealing with the 
distribution and use of income and savings (45). It is well known
that wealth is distributed unequally in this country, some individuals 
having incomes and assets many times the national average. It is also 
apparent that in a sample of a few thousand households the influence 
that a relatively small number of these divergent cases can exercise 
on the total sample is appreciable. To reduce the sampling error 
of the data obtained from the top income receivers, and to make 
possible the analysis of these people as a separate group of the total 
population, it is customary in such surveys as the Survey of Con-
sumer Finances to increase the total number of high-income people
by oversampling the areas in which they are most likely to be found. 
Whenever oversampling of this kind is done, it is necessary, of 
course, to weight these cases down to their proper contribution to the 
total sample when the data are analyzed. 
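
The need to weight oversampled cases down can be shown with a small worked example. The figures below are invented for illustration and are not taken from any of the surveys cited; the function name `weighted_mean` and the sampling intervals (1 in 10, doubled to 1 in 5) are assumptions of the sketch. Each case is weighted by the number of households it stands for, that is, by the inverse of its sampling rate.

```python
def weighted_mean(cases):
    """cases: iterable of (value, weight) pairs, where weight is the inverse
    of the sampling rate under which the case was selected."""
    cases = list(cases)
    total_weight = sum(weight for _, weight in cases)
    if total_weight == 0:
        raise ValueError("no weighted cases")
    return sum(value * weight for value, weight in cases) / total_weight

# Hypothetical universe: 900 ordinary and 100 high-income households.
# Ordinary areas are sampled at 1 in 10; high-income areas at double that rate, 1 in 5.
ordinary = [(3000, 10)] * 90       # 90 interviews, each standing for 10 households
high_income = [(15000, 5)] * 20    # 20 interviews, each standing for 5 households
sample = ordinary + high_income

unweighted = sum(value for value, _ in sample) / len(sample)
weighted = weighted_mean(sample)
print(round(unweighted, 1), weighted)
```

In this toy universe the true mean income is (900 × 3000 + 100 × 15000) / 1000 = 4200. The weighted estimate recovers that figure, while the unweighted mean is inflated by the deliberately enlarged high-income group.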


Contrasting Samples

It is sometimes more efficient to draw samples from subgroups 
which contrast in the variable most important to the study than 
it is to sample the entire universe. 

This design is well illustrated in a Survey Research Center
study of the influence on public attitudes of the proximity of atomic
energy installations (16). The purpose of this study was to find
out whether the establishment of atomic energy reactors tended to
produce insecurity and apprehension in the surrounding communi-
ties. To provide the most effective test of this question, samples were
interviewed in several cities situated within twenty-five miles of
major atomic installations; similar samples were interviewed in
cities paired with the atomic energy cities in geographical, indus-
trial, racial, and other characteristics but situated at some distance
from any major atomic center.

A variation of the same design has been used by R. C. Angell
in a study of “moral integration” of cities (1). On the basis of an
examination of such indices as homicide rates, average contributions
to Community Chest, and the like, Angell was able to order a list
of American cities according to an index of “integration.” He then
chose two cities from both the high and low extremes of this scale
and interviewed samples from the population of each. The purpose
of this study was to ascertain the degree to which the factors which
had produced the original differentiation of the cities as “well”
or “poorly integrated” were reflected in the way people in these
cities evaluated their communities and identified themselves with
them.

The rationale of the contrasting sample design is that the effects
or correlates of a variable thought to be important can be most
clearly seen if situations are studied which provide the greatest
extremes in the presence of this independent variable. Presumably
factors which do not vary even under these contrasting conditions
are not being influenced by the variable in question. As in all such
studies in which only the extremes of a distribution are observed,
there is danger in assuming that a difference in these extremes
reflects a linear relation throughout the total range of the variables
considered. This, of course, does not necessarily follow.


Successive Cross Sections 

Studies of change necessarily require measurements at successive
points in time. In survey research, two types of study utilize the
procedure of successive sampling from the same population: the
before-after design and the study of trends.

No technique is more common in the total array of research
procedures than the before-and-after measurement of a variable
to test the effect of a stimulus, an event, or a change which has been
introduced between the first and second measurements. In social
science the range of change-producing factors which are the subject
of study is very broad indeed. Sometimes these factors are manipu-
lated as experimental variables by the researchers, as in the study
of the effect of special preschool training on the I.Q. of young chil-
dren. Very commonly, however, the social scientist is interested
in the effect of events over which he has no control: for example,
a declaration of war, an act of Congress, a race riot, the advent of
television, or a population movement. Although these events are
beyond his control, he can often anticipate their occurrence and
arrange his measurements accordingly.

A simple example of the before-and-after design is found in
a Survey Research Center study done in 1946 at the time of the
tests of atomic bombings on Bikini atoll (7). The major objective
of this study was to measure the effect which this highly publicized
event would have on American thinking regarding the atomic bomb,
on public anxiety concerning its use, and on popular estimations of
its military significance. A nationwide sample of adults was inter-
viewed in June, just prior to the Bikini tests. These people were
asked a series of questions relating to various aspects of atomic
energy and the atomic bomb. In August, after the results of the tests
had been widely publicized and discussed, a second sample, similar
to the first, was interviewed. This second sample was asked the same
questions which had been asked in June, as well as a number of
special questions relating to perception of what the Bikini tests had
shown. A comparison of the results of the two surveys showed that,
although there was some surprise that the Bikini bombs had not
done greater damage, there was no essential change in public percep-
tions or attitudes regarding the atomic bomb.

A similar design was used by the National Opinion Research
Center in a study of the effectiveness of a campaign conducted in
Cincinnati intended to educate the population of that city regarding
the activities of the United Nations (40). A survey conducted before
the campaign established base-line scores of public interest in the
United Nations, information regarding its activities, and attitudes
toward its accomplishments. This study indicated the sections of the
population which showed themselves to be most in need of “enlight-
enment.” A comparable sampling six months later, after the cam-
paign, demonstrated the extent to which people had been reached
by the campaign and had been influenced by it.

The study of trends differs from the before-and-after design only
in that more than two measurements are taken and the measure-
ments are spaced over a continuing period rather than on either
side of a specific event. One of the best known trend studies done
by sample surveys is Cantril's study of changes in American attitudes
in the period prior to Pearl Harbor toward entry into World War
II (11). Through a series of surveys conducted during 1940 and
1941, Cantril was able to trace the gradual rise of public sentiment
for aid to England, for resistance to Japanese aggression, and for a
declaration of war on Germany.

In a different area of information, the British Survey of Sick-
ness reports each month on prevailing rates of sickness, incapacity,
and medical consultation. This survey of the British population
makes possible the analysis not only of seasonal trends but also of
long-term changes in the state of public health. In the United States
the Survey of Consumer Finances makes possible the study of
annual fluctuations in the economic status of the population.

The same series of surveys may readily serve both to follow
trends and to study changes before and after a specific event. This
is well illustrated by Logan's analysis of data from the Survey of
Sickness, in which he compares rates of sickness and medical care
during the year prior to the institution of the National Health Serv-
ice in July 1948 with rates in the succeeding twelve months (33).
From this study it was possible to estimate the increase in reported
illness (about 5 percent) and medical consultations (about 13 per-
cent) which followed the inauguration of the British system of
socialized medicine.

Logan's study also illustrates a more detailed type of analysis
which is possible with data from successive surveys. Comparisons
through time can be made not only of the total universe represented
but also of many subpopulations within the total. Thus Logan
was able to show that the increases in illness and medical consulta-
tion were greater among women than men, among older than
younger people, and among low-income rather than high-income
families.

The Survey of Sickness utilizes a further nicety of design which
is especially useful when there is need for a large sample. This is the
procedure of overlapping samples. In the Survey of Sickness the
respondent is asked to report on his medical experiences of the
previous two months, a sufficiently short length of time to minimize
memory error. Each month a new sample of 4000 adults is inter-
viewed. By combining the reports of each two successive months'
samples for the one month they both report, it is possible to create
an effective sample of 8000 reports for each month.
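
The pooling of overlapping samples can be sketched as below. The sketch is a simplified illustration under stated assumptions: months are numbered as integers, each monthly sample has four respondents rather than 4000, and the sickness flags are invented. The names `pool_reports` and `make_sample` are hypothetical.

```python
from collections import defaultdict

def pool_reports(samples):
    """samples maps an interview month to a list of respondent reports;
    each report maps a reference month to 1 (sick) or 0 (not sick).
    Returns every report grouped by the month it refers to."""
    pooled = defaultdict(list)
    for reports in samples.values():
        for report in reports:
            for reference_month, sick in report.items():
                pooled[reference_month].append(sick)
    return dict(pooled)

def make_sample(interview_month, size):
    # Toy data: every respondent reports on the two preceding months.
    return [{interview_month - 1: i % 2, interview_month - 2: 0} for i in range(size)]

# Three successive monthly samples, interviewed in months 3, 4, and 5.
samples = {month: make_sample(month, 4) for month in (3, 4, 5)}
pooled = pool_reports(samples)
counts = {month: len(reports) for month, reports in sorted(pooled.items())}
print(counts)   # {1: 4, 2: 8, 3: 8, 4: 4}
```

Months 2 and 3 are each covered by two successive samples and so carry twice as many reports as a single sample provides; months 1 and 4 lie at the edges of this short series and are covered only once.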


Reinterviews

There are some types of survey objectives which require succes-
sive interviews with the same individuals. This may involve only one
interview and one reinterview, or it may mean a series of interviews
extending over months or years.

The reinterview design is used when it is necessary to follow
the activities or attitudes of the same individuals through a specified
time period. This is illustrated by a study of buying intentions
reported by Lansing and Withey in which a national sample of
consumers was asked at the beginning of the year whether they
planned to buy an automobile or other durable goods during the
ensuing twelve months (23, 29). At the end of the year, these same
people were interviewed again and were asked whether they had
bought a car or other durables during the year past. The objective
of this study was to analyze the nature of the planning and decision-
making preceding a purchase, and this could be done only by fol-
lowing through on the stated plans of specific consumers.

A similar design was used by Campbell and Kahn in their study
of the presidential vote in 1948 (8). One of the objectives of this
study was to measure the final shifts from stated intentions to vote to
the final decision on Election Day. Although the aggregate changes
from October to November could have been measured by successive
unrelated samples in these two months, individual shifts could not
have been analyzed without successive interviews of the same people
before and after the election.

In some cases multiple interviews are taken with the same
respondents simply because the objectives of the survey are too
extensive to be covered in a single interview. In the Indianapolis
study of fertility (49), for example, four separate interviews were
necessary in each of the sample households to cover all the questions
which had to be asked. In such a situation, there is no advantage
in spacing the successive interviews and they will ordinarily be
conducted as close together as possible.

Samples which are interviewed repeatedly over an extended
period of time are usually referred to as panels (50). The use of
the panel design is perhaps best illustrated by the study of the
presidential vote conducted in Erie County, Ohio in 1940 by Lazarsfeld,
Berelson, and Gaudet (31). This elaborate project was intended "to
follow the vagaries of the individual voter along the path to his
vote, and to discover the relative effect of various influential factors
upon his final vote." It used a sample of six hundred people who
were interviewed seven times in successive months, May through
November. By this procedure it was possible to observe individual
shifts in inclinations toward one or the other candidate during the
campaign and to relate these fluctuations to specific influences or
stimuli which led to the change.

Beyond its usefulness for the analysis of factors producing
individual change, the panel design has two further virtues. The
more obvious of the two is that the same sample interviewed twice
is a more sensitive measurement of change than two separate samples
of the same universe. This results from the intercorrelation of
variables which is at its maximum in the reinterview design. Moreover,
when the same sample is interviewed two or more times, the
variations implicit in the conduct of field work tend to be repeated
and thus correlated.
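
The gain in sensitivity can be illustrated with the usual formula for the variance of a difference between two estimates. The sketch below is an assumption-laden illustration, not a procedure given in the text: the standard errors of 2 percentage points and the correlation of 0.5 between the two waves are invented, and `se_of_change` is a hypothetical helper.

```python
import math


def se_of_change(se_first, se_second, correlation=0.0):
    """Standard error of the difference between two survey estimates.

    correlation = 0 corresponds to two independent samples; a positive
    value corresponds to reinterviewing the same sample.
    """
    variance = (se_first ** 2 + se_second ** 2
                - 2.0 * correlation * se_first * se_second)
    return math.sqrt(variance)


# Invented figures: each wave estimates a percentage with SE of 2 points.
independent = se_of_change(2.0, 2.0)
panel = se_of_change(2.0, 2.0, correlation=0.5)
print(round(independent, 2), round(panel, 2))
```

The higher the correlation between the two waves, the smaller the standard error of the measured change, which is the sense in which the reinterview design is the more sensitive one.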


The panel design also has the advantage of making possible
the description of how the constituency of the various economic and
social strata of society changes through time. This type of study
would be appropriate, for example, to an analysis of the extent
to which the top bracket of income earners in this country is
comprised of the same people from year to year. A single survey is
sufficient to demonstrate that a certain percentage of the people have
incomes over $10,000 in a specific year, but it would require successive
surveys of the same sample to find out what percentage of the people
have five-year incomes exceeding $10,000 each year. Some facts can
be reliably recalled over a considerable period of time, but for many
facts, such as income, the memory error is so great that reporting
over more than relatively short periods is quite undependable. This
necessitates repeated additive reports if longitudinal data are to be
gathered reliably.

An application of this feature of the panel technique is found 
in the monthly Current Population Survey. This study uses the same 
respondents for six successive months and can thus estimate not 
only the total number of unemployed in the nation each month but 
also the total movement into and out of this group. An increase of
100,000 in the total unemployed group from one month to the next 
might mean that all the unemployed in the first month remained 
out of work during the second month and were joined by an addi- 
tional 100,000 previously employed people. However, it might mean 
that 300,000 people unemployed in the first month went to work 
in the second month and their places in the unemployed group were 
taken by 400,000 people who had been at work during the first 
month. This total movement of 700,000 people can be followed in
the Current Population Survey because changes in the work status 
of the individual members of the sample are reported month after 
month. 
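
The distinction between net change and total movement can be made concrete with a small tabulation of individual transitions. The ten-person panel below is invented to mirror the proportions of the example (three leave unemployment, four enter it); the function name and the status codes are hypothetical.

```python
from collections import Counter


def gross_flows(status_first, status_second):
    """Tabulate month-to-month transitions for the same panel members."""
    return Counter(zip(status_first, status_second))


# Hypothetical panel of ten people; "U" = unemployed, "E" = employed.
month_1 = ["U", "U", "U", "U", "E", "E", "E", "E", "E", "E"]
month_2 = ["E", "E", "E", "U", "U", "U", "U", "U", "E", "E"]

flows = gross_flows(month_1, month_2)
net_change = month_2.count("U") - month_1.count("U")
left = flows[("U", "E")]      # unemployed in month 1, at work in month 2
entered = flows[("E", "U")]   # at work in month 1, unemployed in month 2
print(net_change, left, entered)
```

Two unrelated samples would reveal only the net change; the pairing of each person's two reports is what exposes the movement in both directions.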

Two factors act as deterrents to the use of the panel design. 
The first is the virtually inevitable mortality which occurs in any 
population sample over even a brief period of time. In a cross section 
of the national population, a loss of 25 percent or more is to be 
expected after an interval of one year. A large part of this is due 
to people’s moving from one place to another, some of it to refusals 
to be interviewed a second time, and the rest to the numerous
circumstances which beset the efforts of even the best field
organizations. A loss of this proportion does not necessarily result in a biased
sample but it creates the possibility of serious bias.

The second serious problem associated with the use of panels is 
the possibility that the continued interviewing will so sensitize and 
change the respondents that they are no longer representative of the 
universe from which they were drawn. In the study of the effect 
of the Bikini atom bomb tests, for example, it was thought inadvis- 
able to reinterview the before-test sample after the test for fear that 
the first interview would call the attention of these people to the test
and stimulate them to follow the news of the event more closely than
they otherwise would have. One can easily imagine that respondents
who know they are going to be interviewed month after month on
questions of foreign affairs, for example, will consciously or
otherwise prepare themselves for the next interview.

The effect of these two factors on the data gathered from a
panel can be measured by interviewing a control sample,
independent of the panel, usually at the end of the panel period. If the
data from this fresh sample differ from those of the last panel survey
by more than would be expected from errors of sampling, there
is reason to believe that reinterviewing has introduced bias.
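
One way to judge whether the two results differ "by more than would be expected from errors of sampling" is a two-sample test on a proportion. The sketch below is an illustration only: it assumes simple random sampling (real survey designs are usually clustered, which enlarges the sampling error), it uses the conventional 1.96 cutoff, and the percentages and sample sizes are invented.

```python
import math


def two_sample_z(p_panel, n_panel, p_control, n_control):
    """z statistic for the difference between two independent proportions."""
    pooled = (p_panel * n_panel + p_control * n_control) / (n_panel + n_control)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_panel + 1 / n_control))
    return (p_panel - p_control) / se


# Invented figures: 60 percent in the last panel wave, 50 percent in
# the fresh control sample, 600 respondents in each.
z = two_sample_z(p_panel=0.60, n_panel=600, p_control=0.50, n_control=600)
suspect_bias = abs(z) > 1.96   # assumed 5 percent significance level
print(round(z, 2), suspect_bias)
```

A large value of the statistic does not say which of the two factors, panel mortality or sensitization, is responsible; it only signals that the panel no longer agrees with a fresh sample of the same universe.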


VARIETY OF CONTENT STUDIED 

The versatility of the sample survey lies not only in the variety
of populations to which it may be applied or in the choice of designs
which are available but also in the broad scope of data which may
be gathered. Any fact which respondents are able and willing to
tell an interviewer may become the subject of a survey study.

Surveys may supply answers to the questions “how many" and 
“how much." Some such data are available from public records— 
as, for example, the number of those who voted for a candidate in 
an election, or the total income of all Americans in a given year, 
but sample surveys are uniquely qualified to answer such questions 
as "How many people believe this country is too aggressive in its
foreign policy?" or "How many people use a public library more
than ten times during a year?" or "How many families own stock
in business corporations and how large is their average holding?"

Such information about the frequency of certain opinions or 
activities in the total population represents in a certain sense only 
a preliminary phase of survey research. Surveys are intended pri- 
marily to answer the questions of "who," “how," and "why." Who 
are the people who own their homes, or own common stock, or vote
Republican, that is, what is the occupational, educational, age, etc.,
distribution of homeowners, stockholders, or Republican voters?
How is it that some people have contributed to a Red Cross campaign
whereas others have not; have they been solicited, have they
known about the Red Cross before, and, if so, what, and what are
their attitudes toward the Red Cross? Why do people use a public
library: for purposes of additional training, enjoyment, or both?
In all these respects, the contribution of surveys is unique and serves
to enrich our store of knowledge, that is, our understanding of what
has happened and our ability to predict what will happen.

The content of survey questions may be classified in various
ways. The following classification divides the total range of
questions into four broad areas of content.

Personal Data

Surveys often include questions regarding the sex, age, occupation,
education, religion, nationality, group membership, and many
other personal-social characteristics of the respondents. Similarly,
they may contain questions about the size of income, assets, debts,
and other economic variables. The purpose of these questions is not
so much to determine the incidence of these characteristics in the
population as it is to provide the basis for the analysis of the relation
of sex, occupation, or income to other data obtained in surveys
(9, 21, 24).

Environmental Data

In many surveys it is important to know certain facts regarding
the circumstances in which the respondents live. These might
include data about the character of the local neighborhood, the
adequacy of the living quarters, or the proximity of friends or
relatives.

A study of library use, for example, would need to determine
the relative availability of library services. A study of family relations
would probably require information on the location and degree of
contact with parents and "in-laws." A survey of home accidents made
by the School of Public Health of the University of Michigan
included a detailed investigation of the homes of the respondents:
their lighting, floor plan, location of rugs and furniture, condition
of stairs, and the like (36). Knowledge of such environmental
facts is often needed to explain the behavior which is the survey's
principal object of study.


Behavioral Data

Many survey questions deal with the actions of
respondents. In the economic field, for example, spending and saving
(purchases of Government savings bonds, houses, automobiles,
television sets, etc.) have been studied in surveys so as to determine
the frequency of these activities in a given period and their relation
to other activities, and to determine the characteristics of those who
have engaged in them. Behavior which is only partly economic is
studied in surveys when questions are asked about geographical or
occupational movements, vacation trips, or visits to physicians.
In the study of political behavior, survey questions about voting,
writing to one's Congressman, soliciting party contributions, and
the like are relevant.

The analysis of information-getting behavior would require
questions on newspaper reading, radio listening, television viewing,
movie attendance, conversations, and other related activities. The
total range of behaviors which might interest a survey planner is
obviously very wide.

Level of Information, Opinions, Attitudes, Motives, and Expectations

This broad area of "psychological" data includes many of the
most interesting questions available to survey analysis. It is also the
area in which there are least likely to be data available from
non-survey sources.

The determination of level of information is often necessary as
background to the study of attitudes or opinions. It is dangerous to
assume that issues and events are equally understood by everyone,
and it is difficult to assess how people stand unless we know what
their understanding of the issues is. A respondent's information
level may be measured simply in terms of his awareness or
unawareness of an issue or event. For example, does he know of the existence
of an organization called UNESCO? Does he know that countries
other than the United States participated in the United Nations
Korean campaign? Does he know of federal regulations controlling
installment credit? Information level may also be measured in terms
of the degree of detail the individual possesses. If the respondent
knows that there is such an organization as UNESCO, for example,
does he know how it is constituted, where it is located, what it does,
how it is related to the U.N., and so forth?
Questions regarding opinions and attitudes are illustrated by
inquiries about what people think are the purposes, achievements, 
or shortcomings of the United Nations, how they feel about the 
competence of the federal administration, or whether they prefer 
a sales tax to an income tax. Attitudes are generalized viewpoints
of approval or disapproval. To determine the presence or absence of
attitudes and the reasons for holding them— what public issues do 
what kind of people approve or oppose and why?— is frequently an 
important survey objective. 

Survey analysts are not usually satisfied with obtaining
information on people's attitudes regarding specific unrelated public
issues; it is often more important to investigate patterns of attitudes
and interrelations among different attitudes. In this case we do not
ask simply how many and what kind of people approve of the
Korean campaign, or of the Atlantic Treaty, or of sending troops
to Germany; we want, rather, to determine whether a distinct group
of people emerges who approve of all these and other related policy
measures in contrast to another group opposed to them. In this
way, general attitudes may be discerned and it may be determined
which specific attitudes are and which are not influenced by these
general attitudes.

The study of motives and expectations represents one of the
most challenging areas of survey research. The motive concept
stands not only for the stated reasons for behavior, that is, the
answers to the questions "why" (e.g., "Why did you vote
Republican?"), but more generally for the forces impelling to action.
Expectations represent the time perspective of a person as it stretches
forward into the future, that is, his opinions and attitudes about what
will happen as well as his intentions and plans.


FORMS OF ANALYSIS

Surveys directed toward a joint study of personal
data, behavior, and attitudes are usually more
productive than those intended to cover only one of these areas. Of
the many possibilities of such integrated studies, a few
important ones will be discussed here.
Comparison of Different Parts of the Sample

Instead of studying the distribution of certain kinds of behavior
or of certain attitudes among all people, the sample may be broken
into several groups with the purpose of determining the differences
in behavior or attitudes among these groups. For example, the
analysis of people's perceptions of and actions regarding inflation
during 1950-1951 became more meaningful when low-income groups
were compared with high-income groups. Similarly, studies have
often been made of behavior, opinions, and attitudes of specific
educational or age groups. Groups which are compared need not
represent demographic classifications but may be real groups to
which people belong, such as work groups in factories or offices.

Linking Behavior and Attitudes 

In some surveys the most critical analysis requires the comparison
of behavioral or attitudinal groups. Attitudes are thus not
studied in the abstract but are linked to specific forms of behavior.
In a factory, for instance, high-production employees may be
separated from low-production employees and their attitudes toward
the company or their foreman compared. Or, instead of studying
all people's satisfaction or dissatisfaction with public libraries, the
survey may contrast the attitudes of those who use the libraries
frequently with those who use them rarely or not at all. Contrasting
attitudinal groups have been established by asking people whether
they feel that they are financially better or worse off than a year ago.
Such a division of American families has made it possible to study
whether purchasing of automobiles is or is not associated with the
feeling of being better off financially (22, 23, 28).

The Study of Motivations

In most surveys the direct question of why a certain action was
taken represents one, but not the only, approach to the analysis of
motivational forces. Since many aspects of the prevailing
motivational forces may not be salient in people's minds or may
not be recognized by them as having contributed to a decision,
it is necessary to use a correlational approach. One example of
unearthing motivational forces through correlation studies has been
mentioned earlier. When people were asked during World War II
why they had bought war bonds, they gave many reasons (usually
patriotic or investment reasons) but rarely mentioned that they had
purchased bonds because they had been solicited to do so. A
comparison of people with similar incomes and occupations who had
and who had not been solicited revealed, however, that many more
of the former than of the latter had purchased bonds.
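
The logic of such a comparison, holding income roughly constant while contrasting solicited with unsolicited people, can be sketched as a small tabulation. Every record below is invented for illustration, and `purchase_rates` is a hypothetical helper; the original study's categories and figures are not reproduced here.

```python
from collections import defaultdict


def purchase_rates(records):
    """Share of bond buyers in each (income group, solicited) cell."""
    totals = defaultdict(int)
    buyers = defaultdict(int)
    for income, solicited, bought in records:
        key = (income, solicited)
        totals[key] += 1
        buyers[key] += int(bought)
    return {key: buyers[key] / totals[key] for key in totals}


# Invented records: (income group, was solicited, bought bonds).
records = (
    [("low", True, True)] * 6 + [("low", True, False)] * 4
    + [("low", False, True)] * 2 + [("low", False, False)] * 8
    + [("high", True, True)] * 8 + [("high", True, False)] * 2
    + [("high", False, True)] * 4 + [("high", False, False)] * 6
)

rates = purchase_rates(records)
for income in ("low", "high"):
    print(income, rates[(income, True)], rates[(income, False)])
```

Because the contrast is drawn within each income group, a higher purchase rate among the solicited cannot be put down simply to the solicited being better off.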

Similarly, inquiries about reasons for purchasing automobiles
may shed light on many important factors which have at one time
or the other contributed to the decision to buy a new car. When
asked why they bought a car during the preceding year, very few
people mention that their income had increased. However,
comparison of car buying of those who had an income increase with
those who did not have such an increase reveals the existence of such
a relationship (23).

Another example is found in the analysis of factors contributing
to investment decisions (building of new plants, purchasing new
equipment) on the part of manufacturers. In a survey of top
executives or owners of manufacturing plants, the question of why
expansion was undertaken or expansion plans were entertained
yielded a great variety of reasons. Further relevant factors were
discerned, however, by asking the manufacturers questions about their
profit expectations and by studying the relation of their answers to
the presence or absence of expansion plans. The joint use of
correlation analysis and of the direct question of "Why?" proved
fruitful in helping to explain why some manufacturers entertained
expansion plans though they expected stable or decreasing profits
(25).


Making Predictions 

Although associations or correlations between different variables
do not necessarily show which is the cause and which the effect,
such studies may provide information useful for the study of
causation. If relationships are established, for example, between past
income increases and purchases of durable goods, a prediction
about future behavior in case of widespread and substantial income
increases may be made even if the question about causation has
not been entirely clarified. Detailed inquiries about the
circumstances in which behavior has taken place may result in statements
which imply information on causation, as "People who expect
substantial increases in income will save relatively little."

Asking people about their plans and intentions provides another
method of deriving predictions of things to come. It must be
emphasized, however, that the translation of expressed plans into
predictions is by no means a simple process. If a survey determines,
for example, that a proportion of all American families representing
three million people plans to buy a new car in a given year, the
prediction that three million new cars will be bought in that year
is not justified. Plans are subject to change with circumstances, and
purchases may be made by people who at an earlier time did not
plan to make them. If, however, two consecutive surveys find that
the number of prospective automobile buyers has increased from
three to four million from one year to the next, the statement that
the automobile market is firmer in the second year than in the first
may be justified. Expressed plans represent attitudes prevailing at
a given time, and information about them increases our knowledge
about the situation at that time. The greater our knowledge, the
better is our ability to predict (23, 28, 29).

In some cases survey data can be used very effectively to predict
public reaction to events which are known to be forthcoming. A
survey among industrial workers at the end of 1942, for example,
showed that many of these people had no cash or other assets except
government bonds (48). Since the survey also showed that many of
these workers were making incomes on which they would have to
pay income tax, it was easy to foresee that in March 1943 they would
have to cash some of their government bonds to meet their income
tax obligations. A series of predictions of this kind regarding public
purchase and redemption of war bonds was made during World
War II by a program of survey research sponsored by the U.S.
Treasury Department.


VARIETY OF FIELDS OF APPLICATION 

From the diversity of the examples of survey content presented
in the foregoing pages, it is apparent that the survey method is
applicable to various fields and scientific disciplines. It may be
argued that certain surveys belong in the realm of psychology, others
in sociology, still others in economics, or political science, or public
health. Although such classifications may be justified from a certain
point of view, it must be emphasized that the survey method is
essentially interdisciplinary or, to put it more accurately, that it
contributes to the integration of several traditionally separate
disciplines.

Among surveys which are primarily in the domain of social
psychology or sociology are investigations of group belonging, of
leader-follower relations, of family life, of occupational choice.
Studies of income, expenditures, and savings may be classified as
economic surveys. Studies of voting, of participation in political
movements, and of the distribution of political attitudes may be
thought to belong to political science. Surveys of the incidence of
illness, uses of medical services, or the nature of public beliefs
regarding health concern the field of public health.

Such classifications, however, are somewhat arbitrary. All
surveys, even those in economics or public health, have something to
do with people's behavior. If we ask about level of information,
opinions, or attitudes, we are primarily interested in finding out
how and why people behave as they do. If we omit enumerative
surveys of the census type (intended, for example, to determine
the number of employed workers or the number of farmers raising
wheat in the United States), we may conclude that all surveys have
some relation to psychology. They are concerned fundamentally
with people's behavior: their social behavior, their economic
behavior, their political behavior, their health behavior. Although
statistical reliability requires the grouping of individuals, survey
data always derive from individual reports. Finally, psychology
enters into the picture because of the method of surveys: the
contact between respondent and interviewer represents an interpersonal
relationship of the kind that has historically interested professional
psychologists.

It is not permissible, however, to classify survey research as a
part of psychology. Survey research has no specific disciplinary
anchor point. It is being used by specialists in all fields of behavioral
science, being adapted in each case to the requirements of that
field. Survey data are broadening the empirical base of a variety of
fields. They are also providing the raw materials for an increasing
volume of cross-disciplinary analysis which, it may be hoped, will
serve in time to help bring about a closer integration of the
separate behavioral sciences.

Depending on the intent of the survey planner, surveys can be
a tool of applied research or can have a function in what is
considered basic research. When data are collected which policy makers in
business or government need for their immediate practical purposes,
it is customary to speak of applied research. The survey technique
has been widely used by businessmen in their research on markets,
consumer preferences, buying habits, and the like. Numerous large
commercial research agencies are continually engaged in surveys of
this kind for American business. Various branches of the federal
government have also found reason to conduct applied research of
this kind, usually to assess public response to their programs and
to find ways in which the programs can be improved (4).

The use of the survey method as a basic tool of the behavioral
sciences has been discussed in the preceding section. A great variety
of hypothesis testing has been done through surveys, for example,
in studies of the relation of economic experiences and expectations
to spending and saving behavior (12, 25), the relation of personal
frustration to aggression against minority groups (5), the relation
of identification with political parties to attitudes regarding political
issues (2), and the relation of distance to the transmission of rumor
(14).

The distinction between basic and applied research, though
clear-cut in some instances, often proves to be superficial. What is
called basic research may have more significant practical uses than
a great deal of applied research. Suppose the survey technique is
used to study the dynamics of inflation, as it is, in fact, being used.
If reliable information were available about the factors which
induce businessmen and consumers to stock up or to buy in advance
and in excess of their needs, our basic knowledge of economic
behavior would be greatly enhanced. At the same time, practical
applications regarding anti-inflationary policies would emerge from
such findings. The discussion of the probable effects of new policy
measures could eventually be removed from the sphere of hunches
and guesses and could be based on scientific evidence.

Similarly, the extensive surveys conducted among members of
the armed forces during World War II may be regarded, from one
point of view, as applied research. Some of these surveys (for
example, those concerned with the attitudes of and toward Negro
soldiers) were intended to assist policy decisions regarding the
integration of Negro and white troops (42). At the same time, these
studies can correctly be regarded as basic research, since they serve
to clarify the possibilities of changing attitudes through personal
experience and manipulation of the environment.


FLOW CHART OF A SURVEY

The sequence of the tasks involved in carrying out a survey,
from the first stages of planning to the preparation of the final
report, is presented in this section.

1. General Objectives 

The problems which make a survey necessary and the general 
objectives of the survey are stated. This statement is usually ex- 
pressed in broad terms and defines only the general area and scope 
of the project. 

2. Specific Objectives 

Although the general objectives, usually few in number, are 
formulated without regard to the requirements of the survey tech- 
nique, these are considered when the general objectives are broken 
down into the usually numerous specific objectives. The
specification of all data to be gathered and of the hypotheses to be tested
by the survey is accomplished at this stage.


3. Sample 

Two decisions must be made regarding the survey sample: (1) 
what the universe of the survey is to be (will it be all American 
families, or all employees of a factory, or all the physicians living 
in a region of the country?) and (2) the size and design of the 
sample which is to be drawn. After these decisions are made, the
actual drawing of the sample units and the preparation of delineated
maps, block lists, and the like may proceed.

4. Questionnaire

The method by which the sample is to be contacted (by personal
interview, telephone, or mail) is determined, and a questionnaire is
prepared. The questionnaire is not simply a translation of the
specific objectives into language understandable to the respondents.
It is built carefully, with regard to the type of questions to be asked,
the degree of probing, the sequence of the questions, and the
establishment of rapport. The draft of the questionnaire is pretested in
the field before its actual use.

5. Field Work

When personal interviews are to be conducted, interviewers
must be trained both in general interviewing procedures and in
questions specific to a given survey. Interviewers are supplied with
an instruction manual which explains the objectives of the study
and the meaning of each question. Provision is made for careful
supervision of the interviewing.

6. Content Analysis

The data obtained in a survey may be so simple that the
interviews received may be easily and directly transcribed into
tabulations (or into punched cards through which tabulations are made).
But even surveys of the census type require careful editing, and
attitude and opinion surveys require content analysis. This is done
by preparing a code, a numbered list of major items subsuming all
the responses received to each question. Coders must be trained, and
coding must be supervised and its reliability established.
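
One simple way to establish coding reliability is to have two coders classify the same responses independently and count how often they agree. The sketch below assumes that approach; the code categories and the two coders' assignments are invented, and `percent_agreement` is a hypothetical helper rather than a procedure named in the text.

```python
def percent_agreement(codes_a, codes_b):
    """Share of responses to which two coders assigned the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same responses")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


# Hypothetical code for "Why did you buy bonds?":
# 1 = patriotic reason, 2 = investment reason, 3 = was solicited.
coder_a = [1, 2, 2, 3, 1, 1, 2, 3, 1, 2]
coder_b = [1, 2, 1, 3, 1, 1, 2, 2, 1, 2]

agreement = percent_agreement(coder_a, coder_b)
print(agreement)
```

Raw agreement of this kind makes no allowance for agreement that would occur by chance, so it is best read as a rough first check.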

7. Analysis Plan

The questionnaire of a large-scale survey may contain 50, 100,
or more questions. It would be very inefficient to tabulate the
relationship between the responses received to each question and
those received to every other question. The analysis
plan, which, in case of the use of machine tabulating equipment,
results in writing out “machine requests," contains the machine 
runs which are needed to test the hypotheses enumerated in Step 2. 
This plan has been implicit in the surveyor’s thinking from the 
very beginning of the study. His anticipation of the tabular material 
necessary to answer the objectives of the survey underlay the prep- 
aration of the questionnaire and the determination of the content 
analysis. 
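
The inefficiency of tabulating every question against every other is easy to quantify. The sketch below simply counts the distinct two-way tables among n questions; the function name is hypothetical, and the count ignores three-way and higher-order tabulations, which would add many more.

```python
from math import comb


def pairwise_tables(n_questions):
    """Number of distinct two-way tabulations among n questions."""
    return comb(n_questions, 2)


for n in (50, 100):
    print(n, pairwise_tables(n))
```

With 50 questions there are 1,225 possible two-way tables and with 100 questions 4,950, which is why the machine runs are restricted to those needed to test the stated hypotheses.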


8. Machine Tabulations 

The results of the coding process are used to prepare punched 
cards, and the tabulations foreseen in Step 7 are carried out. 


9. Analysis and Reporting 

The data are analyzed, their reliability is determined, and a 
report is written embodying the survey findings. Sometimes, in the
case of administrative or applied studies, survey findings are used as 
the basis of conferences with policy-makers for the interpretation of 
the implications of the research data for action decisions. This type 
of reporting is sometimes called “feedback." 

This scheme of the sequence of survey work should not imply 
that the nine steps are independent of one another. Some of the 
steps listed in succession are usually carried out simultaneously— 
for example, the code may be prepared before the field work, and 
content analysis may be carried out at virtually the same time as
interviewing. The survey process is a highly interconnected chain of 
events. Decisions at each step must be congruent with what has gone 
before and must anticipate what will follow (32, 34). 


RELIABILITY AND VALIDITY

The reliability of survey data can be measured in the same way
as the reliability of any other kind of research data, by retest. Both
individual and aggregate scores can be analyzed in this way.

Unreliability in survey data results from a combination of
different types of error. Interviewing error arises from inconsistencies
in the way in which the interview is conducted. Reporting error
results from vagaries of mood or attitude on the part of the
respondent. Sampling error is implicit whenever a sample is taken as
representative of a universe. Errors in coding, tabulation, and analysis
make their inevitable contribution to the total. Anything which
tends to create different results under theoretically identical
conditions may be said to be contributing to unreliability.

The reliability (or consistency) of the information given by an
individual respondent can be assessed either through related
questions in the same interview or through the same question in
successive interviews. In the former case, for example, the reliability
of the reporting of personal financial data can be estimated by
balancing the income and expenditure data which are given in
individual interviews. In the same way, respondents’ reports of
their age can be checked against their report of their educational
and employment history. Questions intended to measure degree of
information can be ordered in a scale of difficulty and the extent
of inconsistency of response within individual interviews noted. The
reliability of attitudinal responses can be assessed in a similar way,
using scaling procedures of the type developed by Guttman (41).
Such measures involve some ambiguity, since they reflect not only
individual unreliability but also failure to achieve unidimensional
scales.
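
The within-interview check on information questions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the sample answers are invented, and the simple inversion count used here is one of several possible ways to score inconsistency, not Guttman's own procedure.

```python
def count_scale_inversions(answers):
    """Count (easier item wrong, harder item right) pairs for one respondent.

    `answers` is a list of booleans ordered from the easiest question to
    the hardest; True means the question was answered correctly.
    A perfectly consistent respondent yields zero.
    """
    inversions = 0
    wrong_so_far = 0
    for correct in answers:
        if correct:
            # Each easier item already missed conflicts with this success.
            inversions += wrong_so_far
        else:
            wrong_so_far += 1
    return inversions


# Hypothetical respondents, items ordered easiest to hardest.
consistent = [True, True, True, False, False]
inconsistent = [True, True, False, True, False]
print(count_scale_inversions(consistent), count_scale_inversions(inconsistent))
```

As the text notes, a nonzero count is ambiguous: it may reflect an unreliable respondent or a set of questions that does not form a single scale.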


The measurement of the reliability of report by the comparison
of individual responses in successive interviews involves special
problems. If the two interviews follow each other within a very short
time interval, it is easily possible that the respondent will remember
his earlier answers and simply repeat them verbatim in order to
appear consistent. On the other hand, if a long time interval elapses,
error may be introduced by the respondent’s inability to remember
the data he is asked for. This latter problem becomes aggravated
if the datum which the respondent is asked to report is, like income
for a specific year, a point in a changing series.

It has been shown that some items, such as age, religion, and
country of origin, are reported with a very high degree of
consistency (over 90 percent identity) over periods of a year or more (6).
It has also been shown, however, that reports of annual income
made a year after the end of the reported year often depart
substantially from the report given immediately following the reported
year and that these deviations tend to be biased in the direction of
the individual’s income change during the year following the
original report (50).

Memory errors of this kind are so pervasive that there is a
strong tendency in survey studies to reduce to a minimum the period
the respondent is asked to recall. Income data are usually requested
on a yearly basis, and preferably shortly after the end of the calendar
year, when income tax obligations compel most people to total up
their year’s earnings. The Survey of Sickness, as we have seen, uses
a reporting period of two months. Surveys of radio listening
customarily ask for a report of only the previous day. Generally
speaking, the more ephemeral and less eventful the experience to be
reported, the shorter the reporting period.

The reporting error of attitudinal responses is difficult to assess
in successive surveys because of the problem of differentiating true
change which may have taken place between surveys from simple
inconsistency of report. For example, the same question
(“Considering the country as a whole, do you think we will have good times
or bad times or what during the next 12 months?”) was asked twice,
at the beginning of 1948 and at the beginning of 1949. From a
sample of 655 identical respondents, it was found that 41 percent
gave the same answers and 18 percent gave radically different
answers (22). The distribution of the aggregate scores was quite similar
at the two dates. Nevertheless, some of the consistency (41 percent)
may have been due to chance and some of the change (18 percent)
may not have been true change. There has not been sufficient
research reported to justify any generalizations as to whether certain
types of attitudinal responses are more or less persistent than others
or are reported with greater or less reliability by individual
respondents.
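
The remark that some of the observed consistency may be due to chance can be made concrete with a chance-corrected agreement index. The sketch below uses Cohen's kappa, a technique the text itself does not name, and the two lists of answers are invented for illustration; they are not the 1948 and 1949 responses.

```python
from collections import Counter


def percent_agreement(first, second):
    """Share of respondents giving the same answer in both interviews."""
    if len(first) != len(second) or not first:
        raise ValueError("need two equal-length, non-empty answer lists")
    return sum(a == b for a, b in zip(first, second)) / len(first)


def chance_agreement(first, second):
    """Agreement expected if the two sets of answers were unrelated,
    given each interview's own distribution of answers."""
    n = len(first)
    c1, c2 = Counter(first), Counter(second)
    return sum((c1[k] / n) * (c2[k] / n) for k in c1)


def cohen_kappa(first, second):
    """Observed agreement in excess of chance, scaled so that 1 is perfect."""
    observed = percent_agreement(first, second)
    expected = chance_agreement(first, second)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical answers from eight respondents interviewed twice.
wave1 = ["good", "good", "bad", "bad", "unsure", "good", "bad", "good"]
wave2 = ["good", "bad", "bad", "good", "unsure", "good", "bad", "unsure"]
print(percent_agreement(wave1, wave2), round(cohen_kappa(wave1, wave2), 3))
```

An index of this kind separates raw consistency from what chance alone would produce, but it cannot by itself distinguish true change of attitude from inconsistency of report.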

It is customarily found that the consistency of averages or
frequency distributions is greater than that of individual scores.
In Withey’s study, for example, the distributions of 1947 urban
incomes were found to be quite similar in surveys conducted early
in 1948 and early in 1949 (the same respondents were interviewed
in both surveys), although the individual scores which make up the
distributions were far from perfectly correlated, as Table I
indicates (50).
TABLE I

Comparison of Data Obtained in Two Surveys

                   1947 Money Income                   Proportion in Indicated
                   Before Taxes from     Proportion    Bracket in First Survey in
                   Survey Conducted in   in Same       ----------------------------
                   -------------------   Bracket in    Adjacent       Nonadjacent
                   Early      Early      Both          Bracket in     Bracket in
                   1948       1949       Surveys       Second Survey  Second Survey

Under $1,000         8%         7%          6%                            0%
$1,000-1,999        14         14          10             4
2,000-2,999         23         28          17             5               1
3,000-3,999         22         18          12             9               2
4,000-4,999         14         13           6             6               2
5,000-7,499         12         18           7             4               1
7,500 and over       7          7           6             1               *

                   100%       100%         64%           30%              6%

* Less than one half of 1 percent.

SAMPLE: 415 identical urban spending units who were interviewed once in early
1948 and once in early 1949 and who both times gave information about
their 1947 income.

SOURCE: Surveys of Consumer Finances conducted by the Survey Research Center
for the Federal Reserve Board.
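
The right-hand columns of Table I classify each respondent by how far his second report falls from his first. A minimal Python sketch of that classification follows; the function name and the ten paired reports are hypothetical, and brackets are assumed to be coded as consecutive integers from lowest to highest.

```python
def bracket_shift_shares(first, second):
    """Share of respondents in the same, an adjacent, or a nonadjacent
    bracket in the second survey, relative to the first.

    `first` and `second` hold one integer bracket code per respondent,
    in the same respondent order.
    """
    if len(first) != len(second) or not first:
        raise ValueError("need two equal-length, non-empty lists")
    counts = {"same": 0, "adjacent": 0, "nonadjacent": 0}
    for a, b in zip(first, second):
        gap = abs(a - b)
        if gap == 0:
            counts["same"] += 1
        elif gap == 1:
            counts["adjacent"] += 1
        else:
            counts["nonadjacent"] += 1
    return {key: value / len(first) for key, value in counts.items()}


# Hypothetical bracket codes (0 = lowest bracket) for ten respondents.
first_survey = [0, 1, 2, 2, 3, 4, 5, 6, 3, 1]
second_survey = [0, 1, 3, 2, 3, 2, 5, 6, 4, 1]
print(bracket_shift_shares(first_survey, second_survey))
```

Two surveys can show nearly identical marginal distributions while a sizable share of individuals change bracket, which is the contrast the table is meant to bring out.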

The reliability of frequency distributions from comparable but
independent samples can be readily demonstrated by splitting the
total sample of a survey into randomly selected subsamples. In
Table II the reported income of some 1,200 respondents interviewed
in a national survey of attitudes toward big business is compared
to the distributions obtained from combining the reports of every
fourth respondent (17). The consistency which appears is not
uncharacteristic of survey data based on careful sampling and
interviewing methods.
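
The every-fourth-respondent split described above can be sketched as follows. The helper names and the forty invented income codes are assumptions made for illustration; the sketch shows the mechanics of the comparison, not the data of the survey cited.

```python
from collections import Counter


def systematic_subsamples(records, k=4):
    """Split records into k subsamples by taking every k-th record,
    starting in turn from the 1st, 2nd, ... k-th record."""
    return [records[start::k] for start in range(k)]


def percent_distribution(values):
    """Percentage of cases falling in each category."""
    counts = Counter(values)
    return {cat: 100 * n / len(values) for cat, n in counts.items()}


def largest_spread(distributions):
    """Largest gap, in percentage points, between any two subsamples
    for any single category."""
    categories = set().union(*distributions)
    return max(
        max(d.get(c, 0.0) for d in distributions)
        - min(d.get(c, 0.0) for d in distributions)
        for c in categories
    )


# Hypothetical income codes for 40 respondents in interview order.
incomes = ["low", "mid", "mid", "high", "mid"] * 8
subsamples = systematic_subsamples(incomes, k=4)
distributions = [percent_distribution(s) for s in subsamples]
print(distributions[0], largest_spread(distributions))
```

With real data the subsample distributions would differ by a few percentage points; the size of that spread is the practical evidence of reliability that the comparison provides.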

The high degree of comparability of distributions from
successive surveys can be demonstrated not only for demographic data
such as income or education but also for psychological data. This
is well illustrated by Cartwright’s data on bond buying (13). The
reasons people gave for the Government’s interest in selling bonds
during World War II occurred in very similar proportions in one
survey after another (Table III).
TABLE II

Income Distributions of Systematically Selected
Subsamples of a National Sample

                         A        B        C        D      Total

$0-$999                 11%       8%      10%       9%       9%
1,000-1,999             11       13       12        9       11
2,000-2,999             23       18       19       21       20
3,000-3,999             19       19       20       20       20
4,000-4,999             11       14       11       13       12
5,000-7,499             12       15       16       14       14
7,500-9,999              5        4        4        3        4
10,000 and over          4        3        4        4        4
Don’t know               1        2        1        2        2
Not ascertained          3        4        3        5        4

                       100%     100%     100%     100%     100%

Number of interviews   317      308      299      305    1,227


TABLE III

Reasons Attributed to Government for Wanting to Sell Bonds

                                        Jan.     June     Nov.     June
Reasons                                 1944     1944     1944     1945

To finance the war, to win the war,
  to help soldiers                       65%      65%      67%      68%
To prevent inflation                     14       15       15       14
To get people to save                     7        8        7       10
To provide postwar security               2        3        2        3
Other reasons                            12        9        9        5

                                        100%     100%     100%     100%

Number of interviews                   1,441    1,925    2,148    2,265


Despite the many opportunities for errors to enter the survey
process, there is no doubt that when surveys are conducted with
proper observation of the basic tenets of research, many types of
data can be collected with not only tolerable but reassuring
reliability.
The validation of survey data often presents serious problems.
The customary procedure for establishing the validity of
measurements made in social research is through a comparison with an
outside criterion. Unfortunately, there is not always an acceptable
criterion available when survey data are gathered. The survey is
likely to have been done precisely because there were no relevant
data at hand.

Validation on an individual basis can be achieved by direct
comparison of the information given by individual respondents
with records available from other sources. In a study reported by
Cahalan, examination of courthouse records, automobile
registrations, and other official documents showed a high correspondence
with the report of individual respondents, although some items
showed greater discrepancy than others (3). Not all differences in
such a comparison can be regarded as survey inaccuracies, because
official records are often not entirely up to date or complete and
may not yet show changes which are reported in the survey.

A similar kind of validation can be made when respondents of
known characteristics are selected as the sample and they are
subsequently interviewed regarding the characteristics in question.
Hyman has reported such a study in which a sample of people who
had cashed government bonds during World War II were
interviewed to determine the uses to which this money was put (20).

Seventeen percent of these people denied that they had cashed
any bonds. This relatively high discrepancy reflects the problem
encountered when people are asked to reveal information which is
not entirely to their credit.

A common method of demonstrating survey validity consists of
comparing survey distributions with comparable distributions from
the preceding Decennial Census. Virtually every survey reports
distributions of the demographic characteristics of its sample (age,
sex, race, education, and occupation), and comparisons with Census
data can, at the least, illustrate the absence of gross errors.

A more rigorous test of survey validity consists in comparing
aggregate data derived from sample surveys with independent outside
estimates of the same magnitudes. Unfortunately, however, relatively
few aggregate data derived from surveys are directly comparable
with outside estimates and, most commonly, the comparison requires
complex adjustments for differences in concepts or coverage. How
problems begin with getting access to persons as sources of data. It
has too often gone unrecognized that the problem is not just one
of avoiding refusals, whether by doorstep respondents, by student
subjects, or by representatives of organizations. Such motivational
factors on the part of the interviewees as those noted by Charles F.
Cannell and Robert L. Kahn in Chapter 8 must be taken into
account not only by interviewers but also by analysts in interpreting
responses. In a very literal sense, moreover, the conditions under
which clients and respondents first agree to participate in an
investigation determine the nature of the eventual findings. Both
respondent and investigator, whether they know it or not, are taking roles.
The respondent’s initial structuring of this role relationship will
influence, in conscious and in unconscious ways, both the fullness
and the content of his later responses, just as initial orientations
toward any object influence later behavior in relation to it. As
Ronald Lippitt and Rensis Likert note in Chapter 13, the process
of research planning must include these facts of social life. The
investigator has made a research decision, whether he knows it or
not, when he first approaches a client, a subject, or a respondent,
and he sets the stage for later decisions as he continues or modifies
the initial role relationship.

One aspect of the role relationship between investigator and
subject is an ethical one. In Chapters 3 and 4, some of the uses of
temporary dissembling are noted, together with the responsibilities
imposed upon the investigator who uses them: responsibilities to
colleagues, who may later have to pay a heavy price for his laxity,
and to the “consumers” of his research findings, as well as to the
subjects or respondents most directly involved. The last of these
obligations is probably most easily met; as Chapter 4 suggests, most
subjects accept without resentment the fact of having been duped,
once they understand the necessity for it. Nevertheless, frequent use
of what is regarded as deceit may lead to community-shared
expectations which undermine the necessary relationship of confidence
between investigator and subject. The attitudes of subjects recruited
from such a community may be such as to influence their responses
in ways quite unsuspected by the investigator and thus, perhaps,
invalidate his findings or his interpretations.

On a fundamental level, every social researcher, whether or not
the government to provide jobs for everyone is much higher among
low-income people than among high, that intentions to buy durable
goods are higher among people who expect increases in income than
they are among those who expect decreases, that expressions of hostility
toward Negroes are more common among people in Southern cities
than among people in Northern cities, that people who disapproved
of the Taft-Hartley Act in 1948 were much more likely to vote
Democratic than those who approved it. These findings are
consistent with our general expectations based on other information
about our society and its functioning. This is by no means as
convincing a validation of survey data as an established outside
criterion would provide. In the absence of such criteria, however,
analysis of the internal logic of survey data can often present an
impressively consistent picture.

When survey data are used for purposes of prediction, addi- 
tional questions of validity are encountered. A respondent’s state- 
ment in October of his intention to vote may be a valid expression 
of his inclination at that time but an invalid indication of his actual 
vote in November. A consumer's expression in January of his 
intention to buy a car during the ensuing twelve months may be 
a true representation of his intent, but there are many unforeseen
contingencies which may prevent him from carrying it out. The
fact that there is not exact correspondence between intentions
expressed by individual respondents and their subsequent actions does
not mean, of course, that trend data derived from repeated surveys
of intentions cannot be of value. For example, the trend of inten- 
tions to buy durable goods (expressed as proportions of the total 
sample) determined in the annual Survey of Consumer Finances 
has proved to be indicative of the trend of subsequent purchases. 
Schweiger has shown, also, that this correspondence of trends holds 
good not only for the population at large but also for various sub- 
groups of the population, for high-income people or skilled workers 
for example (37). 


LIMITATIONS 

The foregoing pages have illustrated the scope of the survey
technique. It is clearly a research instrument of great versatility. 
tion, although the Kinsey studies of sexual behavior indicate that
under some circumstances a short interview can bring out personal
information which is ordinarily carefully concealed (27).

A sample survey designed to represent a population dispersed
over a wide geographical area is likely not to give an adequate
representation to any population characteristic which is highly
localized. This means that the influence of specific local social
groups, for example, cannot be assessed through the usual national
survey, since it is unlikely that more than a very few members of
any such group would be caught in a cross-section sample. The
study of local community factors requires a concentration of effort
on the specific community rather than the broad dispersion which
is desirable when a widely scattered population is to be represented.

It is impossible to analyze adequately the complex fabric of
social organization through the survey method alone, because the
process of sampling tends to lift the individual respondent out of
his social context. Other methods are better adapted to the study
of the countless interconnections which give society its integration.

It is apparent that the survey method is not well suited to studies
of historical development. Ordinarily survey reporting refers to a
specific point in time or to a relatively short time period. Studies
of origins and long-term developments require research methods of
a more longitudinal character.

The most obvious limitations of the survey procedure arise from
the fact that it almost inevitably requires a considerable investment
of manpower and time. Small-scale surveys of highly localized and
accessible populations can, of course, be successfully carried out by
a single individual, assuming he has the requisite skills and
diligence. If time is not a pressing consideration, a small group of
researchers can carry through projects of considerable
proportions, as evidenced by the Kinsey survey. More commonly, however,
surveys are conducted by groups of social science technicians,
sometimes several hundred on a single project. This may include
specialists in study design, sampling, questionnaire construction,
interviewing, coding, machine tabulation, and statistical analysis.
The technology of surveying has become so complicated that
professional training in this field (or at least advice) is virtually mandatory
if the many pitfalls which await the untrained are to be avoided.
The ponderous character of many surveys imposes a further
restraint on the researcher which he may find irksome. Once the
design of a survey is set, the survey must be carried through
according to those specifications. This means that it may take months
before a specific hypothesis can be tested, and each new revision
of the study design intended to carry the theoretical development
further will require additional months. To the laboratory researcher
accustomed to varying his experimental design every month,
this slow pace may seem an intolerable frustration.

The full contribution of survey research to the
of the behavioral sciences, however, can be achieved only
continuing programs of research extending over a period of

The value of such programs is not primarily in the
repeat the same observations at different points in time.

important is the opportunity they provide for the application of
an integrated framework of theory to diverse aspects of         tested,

for progressive revision and improvement of the
this development has given an important stimulus

tative study of social phenomena. There is
either in scope of applications or in precision of method.
tage over a discipline which skips this important part of the
scientific process. When the practicality of the additional costs of 
controlled data collection in the social sciences is established, we 
may witness revolutionary developments in these fields. 


THE RELATION OF FIELD STUDIES 
TO SURVEYS 

Although it is not easy to draw a fine logical distinction between 
a survey and a study of a field situation, there are practical differences 
which call for somewhat different techniques and skills. The
difference is roughly between the greater scope of the survey and the
greater depth of the field study. More precisely, two essential
distinctions can be made. In the first place, the survey always attempts
to be representative of some known universe and thus attempts,
both in the number of cases included and in the manner of their
selection, to be adequately and faithfully representative of a larger
population. This emphasis on sampling may or may not be found
in a field study, which is more concerned with a thorough account
of the processes under investigation than with their typicality in a
larger universe. In a survey we always ask about the relative
incidence, or distribution, of social variables or personality
characteristics in the larger group with which we are concerned. The
explanation of trends in population increase, of economic booms
and depressions, of the amount of unemployment, of social change
generally, must be understood in the context of the country as a
whole, so sampled that the many subgroups are properly represented
and the relative weighting of factors, as they contribute to the total
outcome, is ascertained.

A second and more important difference is that in the field
investigation we attempt to study a single community or a single
group in terms of its social structure, i.e., the interrelations of the
parts of the structure and of the social interaction taking place (2).
The survey, to the extent that it deals with such interrelations and
interaction, does so through a study of the final outcome. The
on-going social and psychological processes are inferred in the survey
from their statistical end-effects. In the field study, however, attempts
are made to observe and measure the on-going processes more
Field studies are similar to nation-wide surveys in that they
have opened up new possibilities for the development of social
psychology and the social sciences. On the one hand, they are
breaking down the narrow walls of the traditional experimental
laboratory in the application of a research approach to complex
problems of human relationships. The effect is twofold: (1) our
scientific knowledge is increasing as a result of the direct study of
field situations and (2) the psychological laboratory is beginning to
include in its experimentation social and group variables.

On the other hand, the potentialities of surveys and field studies
for the nonlaboratory social sciences are even greater. These
disciplines have long dealt with complex and significant social problems,
but they have had to rely either on uncontrolled observation or
on data collected for practical rather than scientific purposes. Thus
they have dealt with secondary sources, such as crime statistics or
census materials, in which their research designs are imposed upon
data already gathered. Field studies and surveys permit the
introduction of controls and of research objectives into the data
collection itself. This means that both the problem under investigation
and the types of observations and measures to be taken can now
be under the control of the social researcher. A science which can
gather its own data according to its own research interests, in
addition to availing itself of existing records, is at a tremendous advan-
processes at work in a certain social class, a drive toward power
rather than toward economic security in the top economic brackets. 
The confirmation of this hypothesis, however, is easier to achieve 
by working with a subgroup of the national population. On the 
other hand, after definite motivational processes have been studied 
in detail in a subgroup, the conception of the variables may be so 
sharpened and their measurement so facilitated that it may now 
be possible to utilize measures of these variables in a national survey. 


TYPES OF FIELD STUDIES 

There are many types of field studies, but the method has been
most widely used by the anthropologist in his study of primitive
societies. The sociologist, influenced by this type of natural
observation, has made detailed studies of parts of his own society and has
often been able to add some degree of measurement to the more
interpretative anthropological approach. Finally, the social
psychologist has emphasized the importance of quantification and
verification of observation even in studies conducted outside the laboratory.
The most important dimension, then, on which field studies can
vary is the degree of measurement they represent, ranging from
the extreme of the interpretative anthropological description of a
primitive society to an investigation employing standardized
quantification of data collection in the form of observational scales for
recording behavior and attitude scales for the measurement of
beliefs and feelings.

An illustration of an anthropological field study which attempts
a functional analysis rather than a sheer descriptive account is to
be found in B. Malinowski's investigations of the Trobriand
Islanders (16). Malinowski lived among this Melanesian people, observed
their activities at first hand, and spent many hours talking to a
number of native informants. He presents both a detailed description
and a sociological explanation of their economic activities, their
social organization, their myths and ideologies, and their
psychological character structure. He reports, for example, the central role
of reciprocal obligations in their economic, legal, and ceremonial
life.

The islanders, living near the shore, will have partners in the
directly. Specifically, this means that the field study either attempts
observations of social interaction or investigates thoroughly the
reciprocal perceptions and attitudes of people playing
interdependent roles. Thus, a field study will provide both a more detailed
and a more natural picture of the social interrelations of the group
than does the survey.

Studies of attitudes toward labor-management problems
furnish an example of these two types of approach. A national
cross-section survey would report the incidence of attitudes toward labor
and toward management for the nation as a whole and would seek
to get some account of the distribution of these attitudes among the
subgroups in the population. From such a study we would know
what are the typical attitudes of workers all over the nation, of
workers in unions as against workers not in unions, of farmers,
of middle-class groups, and of owners and industrialists. A field
study concerned with the same problem might deal with a single
plant and would examine the social structure of both the union and
the company. One focus of the investigation might be the power
structure and the patterns of influence and communication within
the union, a second focus might be similar relations within the
company, still a third would be the relations between the two
structures. Systematic interviewing would attempt to get at the reciprocal
perceptions and attitudes of workers, foremen, stewards, and higher
officials, and some observation might be planned of the interactions
occurring in the factory between worker and foreman, worker and
steward, steward and foreman, etc.


Obviously, the field study and the national survey are not so
much alternative ways of studying problems as they are
supplementary procedures which can be used most effectively in combination.
There are two major advantages in using both methods in the same
problem area. First, we know more about the degree of generality
from the findings of the field study if we know how the specific
situation studied fits into the national pattern. If we knew, for
example, how the people of Yankeeville, studied by the Warner group,
compare with the population as a whole, we could interpret the
findings more wisely (22). Secondly, the national survey and the
field study each produce findings for hypotheses which can be more
adequately tested by use of the other approach. For example, a
national survey on class structure may suggest specific motivational
not only for the current period of the study but for earlier years
as well. Similarly, past and current files of newspapers were
consulted, and histories of the state, county, and city were studied.
Considerable emphasis was placed upon understanding the
community in terms of its history. Diaries were examined for the early
years to supplement the other historical records. Moreover, where
no statistical materials were available, the field staff compiled data
on wages, steadiness of employment, club membership, church
attendance and membership, attendance at motion pictures, etc.
This introduction of quantitative materials was made more rigorous
by carefully planned interviews with a sample of working-class
wives and a sample of businessmen's wives. Written questionnaires
were also sent to more than 400 clubs and to three fourths of the
senior high school population.

From this mass of data and source material, the Lynds derived
an account of the major activities of the community, the trends in
pattern of conflicts in its life. They found a differential
rate of change in the performance of various basic
activities, with the greatest changes in the economic pursuits of
getting a living. The general pattern of change was from the
business class to the working class, with the working class often
showing today the habits of the business class of a generation ago.
There are instances, however, when this process is reversed and
practices of the lower income groups have been taken over by the
upper income groups. In general, the currents of change seem
erratic, with respect both to direction and to the differential rates
of different aspects of life. The difficulty of maintaining some sort
of equilibrium under these conditions of stress emerges as a major
problem. One solution which does appear, though it is not
systematically utilized, is a sidling procedure in which an innovation first
appears as an optional alternate mode of adjustment and then
gradually replaces the older mode.

The Lynds have gone beyond the traditional anthropological
approach in utilizing quantitative techniques to supplement their
qualitative materials. Their statements about the activities of

characteristic of its people are often
statistical tables. Their conclusions and their more
picture, however, are drawn heavily from a consideration of
qualitative information and from their own experience
inland villages, so that there is constant interchange of fish and
fresh vegetables carried out with due ceremonial and public display.
Similarly, in the fishing village there is a mutual set of obligations
with respect to the fishing canoe, manned by a number of natives
under the ownership of one man but with effective rights on the
part of the crew with respect to time of fishing and a fair share
in the catch, and with duties toward the maintenance and care
of the vessel. Apparently the binding force of the social norms
of the community comes not from legal penalties or from sheer
conformity to custom but rather from a sustained give and take,
with both parties to an arrangement benefiting thereby, and from
the reinforcement of this relationship through ceremony and public
display. Thus a man who can present his partner with a lavish
heap of food both ensures a good return in the future and is highly
regarded by his fellows for his prowess and generosity. Malinowski
concludes that reciprocity is the all-important principle in the social
norms of this primitive society.

Interesting and plausible as Malinowski's interpretation is,
it lacks the finality of scientific generalization even for Melanesian
society. The hypothesis about the importance of reciprocity for the
maintenance of social norms needs to be tested in systematic fashion
in the Melanesian culture in relation to other social processes which
may be functioning. For such a systematic test it is necessary to have
some measurement of these processes. It should be noted, however,
that Malinowski's functional analysis has definite advantages over
many descriptive accounts of cultures. This type of analysis utilizes
theoretical concepts which point fairly directly to observable social
interactions. Hence the interpretation is readily converted into
testable hypotheses.

A sociological application of the anthropological approach is
found in the Middletown study of the Lynds, which describes the
life of an American community on the basis of the intensive
investigations of a small team of field workers (15). The field workers lived
in the community and participated as fully in its life as they could,
employing participant observation as one of their major methods.
In this process informal interviewing occurred frequently. In addition,
they conducted a thorough examination of all documentary
materials, including census data, city and county records, court files,
school records, etc.
participant members of the community. Thus not all of their
conclusions can be readily confirmed by other investigators because
of the difficulty of duplicating the exact procedures on which the
conclusions are based. In general, however, in its attempt to document
its observations with facts and figures, this study represents
a real advance toward more objective scientific methods. What
emerges very clearly in this attempt is the distinction between
findings or data and interpretations which are derived from the
observations and data.

The pattern of the Middletown study in interweaving quantitative
data obtained from interviews and questionnaires with
material from secondary sources and general observations and
information about the cultural setting has been followed in many
sociological studies. These have often dealt with a more limited
frame of reference than the total community and have contributed
to our knowledge of social stratification, as in the Warner studies
(23), the Hollingshead investigation of adolescence and class
membership in a midwestern community (9), and the Jones study of
the socioeconomic basis of class in Akron, Ohio (11). An interesting
combination of participant observation and systematic interviewing
is found in Child's study of second-generation males of Italian
origin (3).

An early example of the approach of social psychologists
employing more measurement but circumscribing more narrowly the
field to be studied is Schanck's study of Elm Hollow (19). Schanck
followed the anthropological tradition in living in the community
for a period of years and becoming thoroughly conversant with its
ways through observation and through long talks with informants.
In addition, however, he used systematic interviewing in which
every resident of the community was questioned in an informal
interview. The views of each resident were ascertained on the same
standard issues. Thus Schanck was able to quantify his findings and
to state in more precise fashion the degree of relationships found.
It is interesting that this detailed and quantitative approach gave
an account of the community which distinguished sharply between
the formal patterns of belief and behavior in institutional settings
and those in more private and informal settings. Just as the Mayo
(18) group found that, on close inspection, the formal patterns of a
factory are often contradicted by the informal patterns of
interpersonal relations, so Schanck found that the fundamentalistic
beliefs and practices of the older religion were largely for public
purposes. In private, individuals maintained a much more liberal
set of beliefs and attitudes. Part of the conformity on public
occasions was a matter of pluralistic ignorance, the erroneous belief held
by many that the rest of the community felt differently from the
way the respondent himself felt. Also important was the acceptance
by the villagers of the leadership roles of the town minister and the
chief contributor to the church funds. Later study showed that
without the reinforcing effect of these leaders, the pluralistic ignorance
in the small community was quickly dispelled, with genuine
reversals in accepted patterns of behavior concerned with religious
taboos.
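The idea of pluralistic ignorance can be made concrete with a small sketch. The code below is only an illustration and is not Schanck's procedure: the records, the five-point scale, the field names `private` and `perceived_others`, and the `tolerance` parameter are all hypothetical. It compares what residents privately report with what they believe the rest of the community holds.

```python
from statistics import mean

# Hypothetical interview records on one standard issue.
# Scale (an assumption): 1 = strict position, 5 = liberal position.
# "private" is the resident's own view; "perceived_others" is the view
# the resident attributes to the rest of the community.
responses = [
    {"private": 4, "perceived_others": 2},
    {"private": 5, "perceived_others": 1},
    {"private": 4, "perceived_others": 2},
    {"private": 2, "perceived_others": 2},
    {"private": 5, "perceived_others": 3},
]


def pluralistic_ignorance_gap(records):
    """Mean private view minus mean view attributed to others."""
    private = mean(r["private"] for r in records)
    perceived = mean(r["perceived_others"] for r in records)
    return private - perceived


def share_misjudging(records, tolerance=1):
    """Share of residents whose estimate of others differs from the
    actual mean private view by more than `tolerance` scale points."""
    actual = mean(r["private"] for r in records)
    off = [r for r in records if abs(r["perceived_others"] - actual) > tolerance]
    return len(off) / len(records)


gap = pluralistic_ignorance_gap(responses)
share = share_misjudging(responses)
print(f"gap between private and attributed views: {gap}")
print(f"share of residents misjudging the community: {share:.0%}")
```

A large positive gap together with a high share of misjudging residents would be consistent with the pattern described above, in which most individuals privately hold a more liberal view than the one they attribute to their neighbors.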

This early community study, which emphasized the individual
as a unit of measurement, was restricted in scope and in technical
thoroughness. A social-psychological field study employing more
detailed measurements is Newcomb's research on Bennington
College, a self-contained college community (17). Measures of objective
role in the college were obtained from ratings of a cross section
of student judges who selected the most extreme individuals in each
class on each of 28 characteristics related to community citizenship.
Subjective role was measured by the individual's own view of her
relationship to the community, including her awareness of
differences between self and others. Individual prestige was ascertained
through nomination of students, by their fellows, as most worthy to
represent the college at an intercollegiate gathering. Finally, a series
of attitude scales was administered to all students and was repeated
for some of the students during their college careers. The attitude
scales dealt with public affairs which were central to the values
of the community. The results showed that, as students assimilated
to the college community, they took on the values of the group;
the acknowledged leaders showed greater effects of such group
membership than the followers. Thus there was good correspondence
between the attitudes of students valued by the community
and objective role played by the student. Moreover, subjective role
indicated that within the objective pattern there were characteristic
personality modes of adjustment which were the resultant both
of the present situation and of past methods of relating to one's
fellows.
be readily resolved by utilizing the anthropological approach as the
initial stage in a field study. This stage can utilize to the full the
advantages of seeing the situation as a whole and of attempting
to grasp the fundamental relationships. From this study can come
the insights which can furnish the hypotheses for later, more
detailed, quantitative study. In fact, in the Festinger-Schachter-Back
study of a housing community precisely this procedure was
employed (7). In the first stages, informants and informal participant
observation were used, as well as observations of important group
meetings. Then, in the later stages of the study, systematic
interviewing of all housewives was carried out, and sociometric techniques
were used to discover communication and preference patterns.
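As a rough illustration of what a sociometric tabulation involves, the sketch below counts the choices each person receives and lists the reciprocated choices. It is a minimal example under stated assumptions and not the procedure of the study cited: the respondents "A" through "D", their choices, and the function names are all hypothetical.

```python
from collections import Counter

# Hypothetical sociometric data: each respondent names the people
# she would most like to see socially.
choices = {
    "A": ["B", "C"],
    "B": ["A"],
    "C": ["A", "B"],
    "D": ["A"],
}


def choices_received(choices):
    """Count how many times each respondent is chosen by others."""
    counts = Counter({person: 0 for person in choices})
    for chosen in choices.values():
        for person in chosen:
            counts[person] += 1
    return counts


def mutual_pairs(choices):
    """Return the set of pairs in which each person chooses the other."""
    pairs = set()
    for chooser, chosen in choices.items():
        for person in chosen:
            if chooser in choices.get(person, []):
                pairs.add(frozenset((chooser, person)))
    return pairs


received = choices_received(choices)
mutual = mutual_pairs(choices)
print(received.most_common())
print(sorted(sorted(pair) for pair in mutual))
```

Persons with many choices received are candidates for the centers of preference patterns, persons with none are possible isolates, and the mutual pairs give a first indication of cliques.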


STEPS IN THE CONDUCT OF A FIELD STUDY 

It is important, then, to turn to the steps in the conduct of a
field study. The following model cannot be fully realized in every
study. Moreover, specific studies often dictate their own procedures.
But there is some advantage in breaking down an investigation
into its major processes. Thus, the following phases can
be examined for their relevance to a contemplated study: (1) preliminary
planning, (2) the scouting expedition, or the anthropological
short cut, (3) the formulation of the research design, (4) the
pretesting of research instruments and procedures, (5) the full-scale
field operation, and (6) the analysis of materials.


Preliminary Planning 

Ideally, the field study should start with a period of research 
planning in which some tentative decisions are made about the 
scope of the study, its general objectives, and the timetable of its 
stages. As a general rule, exact formulation of research design is left 
to a later stage, when the results of the scouting expedition are 
available. Often one purpose of the field study is the obtaining of 
a better knowledge of the significant variables rather than the final 
testing of a well-formulated theory. Even where the field study
is a follow-up of other research, however, it is important not to 
These four studies indicate that the conflict between the
anthropological and the quantitative approach can be resolved, though
the differences in methods should not be glossed over. The
anthropologist or sociologist who has familiarized himself thoroughly with
a culture or community, who has lived in it, observed its people,
talked with them at great length, studied its history, and immersed
himself in all available materials can give a picture of the functioning
group as a whole. He can make insightful interpretations of its
social processes. And this type of study provides a great deal of
information about a community or a culture with a remarkable
economy of effort. For example, Harbison and Dubin, in their case
studies of union-management relations, produced a remarkably
informative picture of the significant variables and their
interrelations (8). Similarly, Dollard (6) and Davis, Gardner, and Gardner
(5) demonstrated the economical advantages of the anthropological
method in their studies of caste and class. A measurement approach
would involve many more field workers and many times the time
and energy and would still not be able to achieve the same high
level of understanding and interpretation. Moreover, the quantitative
approach, because it seeks for easily measured variables, may
focus on microscopic and trivial factors and miss the significant
processes in group functioning.

On the other hand, the anthropological procedure represents
only the first step in science because its rich interpretations are not
based on relations which have been quantitatively established.
They are inferences which either represent a wholistic type of
judgment or are based upon what the investigator regards as his most
central observations. There is little attempt at specification of the
types of data which are necessary for the measurement of a given
variable. Hence, it frequently makes difficult and often impossible
the verification of relations by another investigator. The history
of social psychology illustrates the importance of the replication
of findings in that many of its initial results have not been
confirmed by later investigations. Only when we attain the level of
standardizing our specifications for data can we see the extent to
which reported findings are true generalizations.

This dilemma posed by the two approaches is more of a
historical accident than a logical necessity. In many instances it can
intensively and which specific processes are most fruitful for the
study. But if some of these questions can be posed in advance 
of the exploratory scouting, much time can be saved. 


The Scouting Expedition 

The scouting phase of any study is the period of informal and 
relatively free investigation in which the field workers try to get 
as thorough an understanding of the important forces in the situa- 
tion as they can. During this period they either live in the group 
to be studied or make frequent trips to observe it at first hand. The 
scouting expedition is, thus, not a pretest in which already formu- 
lated instruments are given a field trial. It is essentially exploratory, 
with the objective of finding out what the significant variables in the
situation are likely to be and what types of instruments may have 
to be constructed to obtain measures necessary in the final study. 

The advantages of employing more than one investigator for 
the scouting study are obvious. Not only is the single investigator 
limited by time pressures but his own biases need to be checked. 
Moreover, with a team of field workers, supplementary and
complementary skills can be utilized.

Although freedom for the investigator to follow interesting 
leads and to utilize his own ingenuity in obtaining information is 
the very essence of the scouting stage of a study, this is not freedom 
in an absolute sense of random or aimless activity. Enough is known 
about social groups in general so that we have some knowledge of
the types of things to look for in most social situations. For example,
even though the Lynds (15) envisaged their study of Middletown as
exploratory, with no attempt to prove or disprove any set of
hypotheses, they did assume that there were broad categories of basic
social behavior that should be observed: getting a living, making a
home, training the young, using leisure in various forms of play and
art, engaging in religious practices, and engaging in community
activities. As a rule, the broad framework of types of [...]
studied will vary in relation to the purpose of the [...].
Nevertheless, there is much to be said for having some [...]
in mind for scouting purposes to ensure the type of coverage
of information which the nature of the study may [...].

The following framework is suggested as a possible guide
freeze the design before the scouting expedition. It is difficult and
sometimes impossible to know what measures are feasible in a given
field setting without a firsthand exploration of the situation.

It is well to be alert to the general temptation to envisage
the study too broadly and to make an unrealistic appraisal of what
can be accomplished within the time and budgetary limits of the
project. The scouting stage can be more valuable if there is some
major focus and some restriction of area. Another temptation
against which planning offers protection is the tendency to accept
a given community or group for study because of its easy accessibility
and because of assured cooperation from a few key people in it.
These are important considerations, but they should not outweigh
the research objectives. It may be that the most easily accessible
community is not the best place for studying the phenomena in
which we are interested. If the suitability of the community for the
purposes of the study cannot be decided in early planning phases,
it can be posed as one question for the scouting stage.

There is no agreement in practice about the use of previous
research in the planning phase. The current tendency is to ignore
what has been done in the past because the researcher does not want
to become contaminated by old concepts, or because he regards
previous studies as irrelevant and useless, or because he prefers
to use his time in his own research rather than in the library.
Undoubtedly, in the pioneering stage of any discipline where
sophisticated methodology is relatively new, there is much justification
for this point of view of moving ahead and disregarding what
has gone before. But increasingly we shall want to build a science,
and this can be done better if each investigator does not start anew
with his own terminology and insulate himself from what others
have done or are doing. This is true with respect to both substantive
research findings and methodological results.

It is better to start with some general plan concerning research
objectives, personnel, and timing than with an unstructured program.
The plan, however, should allow for changes in decisions
as a result of scouting and pretesting. The fact that there is a plan,
however, enables the people doing the scouting to bring back
information on the types of questions about which final decisions
must be made. It may not be desirable to decide at the outset which
specific subgroups within a community are to be studied most
an industrial plant to carry out the anthropological part of the
investigation. 

1. Contacts should not be limited to a narrow segment of
informants. People are very limited in their information by their daily
social roles. They not only lack knowledge of the activities of others 
but they are biased by the positions they occupy. Hence it is impor- 
tant to be in communication with some person from every important 
subgrouping and every important type of social role. This has been 
the classic error in diplomatic intelligence in the past; observers 
in a legation post have moved among people like themselves, rep- 
resenting the upper social strata of the country they are studying. 
In general, when we enter a new community we tend to seek out 
people very much like ourselves. So, too, does the inexperienced 
field worker. Even interviewers with specified quotas of respondents 
to obtain will, if not checked by certain controls, bring back an 
undue number of interviews with people of the same religious and 
socioeconomic characteristics as themselves. Hence, the field worker 
should be alert to the problem of obtaining a wide coverage of 
informants. Opinion studies which have not met the requirements 
of precise sampling have often been surprisingly accurate because 
they have obtained a fairly wide spread of respondents from all 
important types of groupings. 

2. Informants who themselves have a wide range of contacts 
should be utilized. The person who by virtue of his role or his 
personality has a high rate of contact may have especial usefulness 
for the field worker. People involved in communication activities 
maintain many contacts and often have information which does
[...] for example, can readily describe the pattern of leadership in the community
and identify the hierarchy of informal political bosses.

3. Informal leaders, as well as the people in positions of formal
leadership, should be located and consulted. The account of the 
formal leader always needs to be checked and supplemented with 
what the informal leader can contribute. Not only is the informal 
leader often in possession of facts and interpretations not known 
to the official, but he may also be in a better position to express 
freely what he does know. For example, in one field study of
union-management relations, one of the men at headquarters holding
obtaining broad coverage on important aspects of group functioning.
It will not, of course, have equal value for all types of field
studies, but it does reflect many of the types of variables which social
psychologists and sociologists are finding significant in the
understanding of the specifics of group behavior.

1. A description of the total structure under study with respect
to the major groups and subgroupings.

2. The central value systems and goals of the total system and
its various groups.

3. The nature and types of conflicts and points of tension, both
with respect to the total structure and with respect to a single
group.

4. The formal and informal structure and the way in which they
are interrelated.

5. The accepted pathways to group goals, including:

a. the logical relation between paths and goals
b. the remoteness of paths from ultimate goals, or the
number of subgoals between a group's activity and its
ultimate goal
c. the degree of fixation upon one or two main paths and
the range of permissible alternate routes

6. The degree of autonomy of functioning of the parts within
the total structure and the nature of their dependency upon
one another and upon the larger whole.

7. The nature of the dependency of the structure under study
on the society or larger unit of which it is a part.

8. The power or influence patterns within the structure and its
subgroups.

9. The nature of the group sanctions and the degree and basis
of their acceptance by group members.

10. The patterns and channels of communication within the
structure and the substructures.

It is also helpful for the field worker to be trained in informal
ways of gathering information. There are a number of practical
procedures which can be followed, though their usefulness will vary
from study to study. The following procedures should be kept in
mind by field workers when they go into a community, a group, or
5. Information from respondents should be assessed in relation
to their social role and position, their group memberships, and
their personal activities. Hence, it is important to get as much
information as possible about the informant's priority of group allegiances, his
position in the power structure of the groups to which he belongs,
and his major roles, as well as his own aspirations and goals in life.
One principle that generally holds true for hierarchical structures 
is that the people at various levels in the hierarchy are generally 
more sensitive about the actions and feelings of those immediately 
above them than of those below them. Advancement up the ladder 
depends upon an upward orientation and an ability to relate oneself 
effectively to one’s superiors. Hence, a foreman in a plant may know 
more about the way of thinking of his immediate bosses than of his 
own men.

In addition to motivational biases, which need to be known 
before information can be assessed, is the factor of the amount of 
knowledge which the informant can be expected to possess on the 
basis of his contacts and experiences. This was the reason for the
previous emphasis upon finding respondents who are very active
and who enjoy wide contacts in the community. One caution that
needs to be stressed has to do with the positional lag in information.
The man who has moved up in a hierarchical structure knows, 
through his own experience and through the important contacts 
he had in the local group, about the lower levels from which he 
has come. In his new role, however, he has generally lost these points 
of communication. Nevertheless, he often feels he can talk accurately 
about his former level of association. Similarly, the general in the
army will talk authoritatively about the problems of his men
because twenty years ago as a lieutenant he was close to them and
understood their way of thinking. This informational lag is less
true of organizations based upon functional representation, where
the leader must report back to the group which elects him. This 
procedure immediately forces contact between the leader and the 
people below him. But if the organization lacks functional represen- 
tation, the information which high-level leaders may give about 
the lower levels may be irrelevant and inaccurate. 

6. Ideally, it is desirable to spend considerable time in
participant observation. Reports of informants and information derived
from secondary sources need to be supplemented with living in the 
no official position could describe union policy very frankly. He
admitted that there was some fat in the present group piece rate
that the company was trying to change on the occasion of the
introduction of a new assembly line. The real issue was not the attempted
change but the basic intent of the company in wanting change.
If the company wanted minor and reasonable modifications, that
was one thing, but if this meant a new policy of pushing the union
to the wall, then the men would fight against the most minute
change with all the resources they could muster. And until they
could get a better line on basic company intent, they intended
to do some experimental skirmishing with the company on the issue
of new standards. The responsible union officials would not express
this point of view so explicitly, but later events showed that it did
represent union sentiment and union policy.

Locating the informal leaders is generally not too difficult if
the field worker can spend enough time in the community under
study. They are usually known to those who themselves have had
some relation to the practical functioning of the groups in
question. Those who have attempted to organize some community
function soon discover who the key people are whose cooperation
is essential. Often the officials of a rival group can readily identify
the informal leaders on the other side. A plant manager who wants
to introduce a new machine often knows that he must convince
not only the union leaders but also the informal leader among
the old-time workers. And the rank and file can generally say to
whom they turn for advice and direction.

4. Discrepancies in the accounts of various informants should
be used as the basis for further exploration. There should be
discrepancies in the information the field worker is obtaining. If all
his informants give him the same story of complex group relations
[...], he is probably not covering a wide enough
representation of people in different roles and different positions. The
contradictions he does find, however, should determine the
direction of further inquiries. He can, by questioning additional
informants, find out whether the differences in the report he has obtained
are a function of idiosyncratic perception and experience or a
reflection of group membership and role differences. Moreover, apparent
contradictions can be resolved by determining the frame of reference
of the respondents who disagree.
ative betting fraternity was giving odds of 55 to 1 that Dewey would
be elected. 

The field worker needs to be aware of this type of public fiction. 
It can exist where a belief has become so prevalent that few would 
dare to challenge it. It can also exist in the area of taboo subjects, 
such as sex. The Kinsey report, although not based upon representa- 
tive sampling procedures, does raise the question of the extent to 
which accepted beliefs about sex practices are public fictions. 

8. Full records should be kept by field workers. Part of the 
discipline of the investigator is rigorous note-taking and the setting 
aside of at least two periods daily during which the notes are elab- 
orated into a full report. No matter how excellent his memory, 
the worker cannot reconstruct from his notes his original observa- 
tions after a lapse of time without some losses in completeness and 
accuracy. This is especially true in the field situation where a con- 
stant succession of similar experiences may maximize retroactive 
inhibition and inaccuracies. 

9. Initial impressions and global judgments should not be 
discarded. Although detailed documentation is the goal even during 
the scouting period, it is nonetheless true that this is also the period 
when maximum play should be given to over-all impressions. As the 
Gestalt psychologists have demonstrated so effectively, the human 
mind does grasp things as a whole. But this type of wholistic percep- 
tion tends to be neglected in our scientific efforts at precision. In 
the scouting stage, however, field workers should be encouraged 
to record their initial impressions. These first judgments can be 
surprisingly useful because the situation can sometimes be perceived 
in its main outlines at the very start. As the exploratory work
progresses, there is a tendency for the details to obtrude themselves. 
Therefore, there is some point for the investigators to try a summing 
up at stated intervals to make them see the whole picture again. 

10. Available records and secondary sources should be studied 
carefully, and the operational procedures for deriving such records 
should be examined. Not only are such existing materials of great 
value in the understanding of the situation but they sometimes 
can be used as measures of variables in the larger study. For exam- 
ple, a field study of an industrial situation may well want to inquire
fully into productivity and other records maintained by the com-
pany being studied. When the report is made of the scouting expedi-
community, participating in its activities, and constantly observing
what people actually do in specific situations. There are frequently
practical difficulties with participant observation, since it adds
greatly to the length of time the scouting phase will take. But there
is no good substitute for having field workers actually live in the
community and perform some of the roles they are interested in
studying. Empathic understanding of the problems faced by the
people under study cannot be obtained fully through hearing about
experiences from others or even from direct observation. Thus, if
we are making a study of an industrial plant, the field workers will
have a much better account of the situation if they can actually
spend some time both in the manager’s office and on the production
line. In the absence of experiences from participation, the field
worker should spend as much time as possible in direct observation.
He should attend meetings of organizations and observe people
in group situations. There are interesting discrepancies between
what people say in isolation and the way in which they behave
when they are under group pressure.

7. Personalized and private beliefs should be sought as well as
the socially accepted climate of opinion. In an attempt to be helpful
and objective, informants will report the accepted point of view
about which there is public agreement. This public agreement may
represent what people are supposed to believe and say: the world
of newsprint, the publicized and semiofficial version of the state
of affairs. Now, it is essential to know this public climate of opinion
since it does affect social behavior. But it is also important to get
below this first level to the more private beliefs and actions of
individuals. The field investigator should attempt to get from
respondents their own private views and their own personal be-
havior as well as the accepted climate of opinion. And behavioral
observation procedures can be helpful here.

In the 1948 presidential election almost everyone except Presi-
dent Truman had accepted the inevitability of the election of
Thomas E. Dewey. The press, the periodicals, people in official
position, even professional politicians had accepted this public
fiction. Although the Gallup poll showed Dewey ahead by only
5 percentage points, with 12 percent of the people undecided, the
pollsters themselves were victims of the fiction. So powerful did this
myth grow through constant social reinforcement that the conserv-
ment in the group. In this second type of study, we would develop
detailed measures of these independent variables and would make 
exact predictions for the productivity of work groups varying in 
group standards and solidarity. We would also specify the conditions 
which have to be held constant for these predictions to be realized. 
Since these conditions may not be held constant directly, we would 
measure them to achieve some statistical control over their effects. 

Ideally, the testing of hypotheses is more suited to laboratory 
experimentation, and exploratory discovery to field studies and 
surveys. This does not mean, however, that field studies should con- 
fine themselves wholly to exploratory procedures. The scouting stage 
can often be used as the more purely exploratory part of the investi- 
gation, and some degree of hypothesis-testing can be employed in the 
larger operation to follow. Moreover, there are occasions when the 
field approach can be used for very important hypothesis-testing, 
as in the “natural experiment” (see Chap. 3). But it is nonetheless
true that the great strength of the field type of study is its inductive
procedure, its potentiality for discovering significant variables and
basic relations that would never be found if we were confined to
research dictated by a hypothetical-deductive model. Thus, the field 
study and the survey are the great protection in social science against 
the sterility and triviality of premature model building. 

It is possible, of course, to combine both exploration and hy- 
pothesis-testing in a single field study. One major set of hypotheses 
can be investigated at the same time that other materials are gath- 
ered for exploratory purposes. This has the advantage of protecting 
the study from failure if inconclusive results are found with respect
to the hypotheses. The exploratory materials then become the safety 
factor. The disadvantage of this compromise is that it attempts to 
combine two studies in one investigation, sometimes to the detri- 
ment of both. 

Even an exploratory study should be so designed as to provide 
as definite information as possible for a set of research objectives. 
There are at least two levels of exploratory studies. At the first 
level is the discovery of the significant variables in the situation; 
at the second, the discovery of relationships between variables. Even 
at the first level it is important to delimit the area to be studied
and to introduce controls into the data-collection process. Explora-
tory studies which do not set limits for themselves have limits im-
tion, it is not enough to know that productivity records exist on
individual workers. It is essential to know what these records are 
based upon, to what degree the productivity of the worker is 
set by the pace of the machine or the assembly line and to what 
degree by his own efforts, how comparable productivity records are 
for individuals performing different tasks, how stable are the pro-
ductivity differences reported over time, etc. Unless the operational
meaning of these measures is known, it is impossible to construct a 
research design which will utilize such records. 


The Formulation of the Research Design 

As the results of the scouting exploration become available, 
the design of the final study can be worked out more exactly. There 
are advantages in developing the design as the scouting proceeds 
rather than making it a separate step in a temporal sequence. This 
permits of some interaction between the possible theoretical objec-
tives and the realities of the field situation. At some point, of course,
final decisions must be made about research objectives and proce-
dures for the full-scale study, and such decisions call for a thorough
consideration of all the findings from the scouting expedition. 

Roughly speaking, studies are of two major types: exploratory 
and hypothesis-testing. The exploratory study attempts to see what 
is there rather than to predict the relationships that will be found. 
It represents the earlier stage of a science. From its findings may 
come knowledge about important relationships between variables, 
but the more definite proof of these relationships comes from hy- 
pothesis-testing. 

For example, in a field study of industrial morale we may be 
interested in the factors related to productivity. If the study were 
of an exploratory type, it would not start with clearly defined
notions about the relationships to be found. It would set a broad 
net and include measures of a wide variety of perceptual and mo- 
tivational factors in the hope that some of these measures would 
show a relationship to productivity. If the study were of the second 
type, namely hypothesis-testing, we would start with a well-formu-
lated notion that under specified conditions productivity would
vary directly with a given factor or factors, perhaps the group stand-
ards of the face-to-face members of a work section plus their involve-
Thus it resembles hypothesis-testing in resting its case upon the
relationships discovered rather than upon the precise use of mathe- 
matical techniques. The major difference between such an explora- 
tory study and the hypothesis-testing investigation is that in the 
former there are no specific predictions of relationships based upon 
theoretical derivations. The researchers do have hypotheses in mind, 
but these are not precisely formulated. In a study of class structure 
in a community, for example, we may start with the general assump- 
tion that a significant motivating factor in class identification stems 
from the economic role which the individual plays. But we may not 
be prepared to specify what we mean by economic role, or what
other roles may account equally well for psychological class iden- 
tification. Therefore we plan our research so as to study the many 
possible types of economic role, including the part the individual 
plays in consumption, in the technical aspects of production, in the 
social aspects of production, etc. Within the broad frame set by our
research objective, we hope to find some significant relationships. 
Or, in a study of industrial morale, we may be concerned with the 
in-plant factors which are related to worker satisfaction. We shall 
include all the important aspects related to the job and the plant,
from wages and working conditions to type of immediate super- 
vision and congeniality of fellow workers. Then, in analysis, we hope 
to find significant relationships between worker satisfaction and 
some of these in-plant factors. 

In this sort of exploratory study, the design should be so con- 
structed that measures are available for all relevant dimensions of 
the area under investigation, but the study should be confined to a 
limited type of problem. It may be that a whole set of variables, 
which have been omitted as not belonging to the area under investi- 
gation, have more to do with the dependent variable than the fac- 
tors which are studied. For example, economic role conceivably may 
not be as important in class identification as sociometric personal 
preferences, or the length of residence of the family in the com- 
munity, or the number of ancestors who fought in the Revolutionary 
War. But it is a mistake to believe that one study is going to be able 
to account for all the variance in complex social phenomena. It is 
much more effective to take one central set of variables and investi- 
gate them as thoroughly as possible than to try to study the universe 
in one piece of research. This is a widely accepted principle among
posed by various practical matters, some of which are not realized
by the investigators.

For the exploratory study aimed at the discovery of variables
rather than relationships, factor analysis is often urged as the best
method of finding out the unitary and independent factors in the
situation. From a design point of view, much more is known about
the assumptions of factor analysis on the side of statistical treatment
of materials than about the assumptions on the data-collection side
of the process. The tendency in the applications of factor analysis to
social settings is to throw all sorts of measures of various degrees
of precision and validity into the hopper of factor analysis and to
depend upon statistical sophistication to grind out meaningful
entities. Controls on the data-collection side, such as measures taken
under standardized conditions and an adequate sampling of situa-
tions, are disregarded. Although factor analysis is a powerful tool
for handling statistical materials, it is of very limited use in field
studies unless the measures we employ in the first instance are defen-
sible. The major need, then, in the design of the exploratory field
study is the provision for controls in the observation of behavior
and in the recording of respondents' ideas, perceptions, attitudes,
sociometric choices, etc.

These controls should be concerned with standardized or com-
parable conditions under which observations are made and inter-
views taken and with measures of reliability for the data gathered
(see Chapter 6). This implies a fair degree of specification of the
cues the investigator is to use in coding the behavior he observes
and a certain degree of structure in the interviewing situation. The
freedom of the scouting phase is over and we now need measure-
ments of the factors we are describing as important in the area
of our study. Another major requirement in this first-level type of
exploratory study is adequate coverage of the relevant vari-
ables in the situation. This calls for thorough and even coverage
of the many aspects of behavior which seem to be related to the
main problem under investigation. Only under these conditions
will factor analysis or some similar technique have a real oppor-
tunity of discovering the significant unitary factors.

In the second type of exploratory study, where the objective
is the discovery of relationships, there is less concern with adequacy
of coverage of behavior and less interest in the use of factor analysis.
of the War Department was able to measure some of the effects of
contact under these conditions of group survival, although the study
was not a natural experiment in that its measures came after the 
event. Nonetheless, when a planned change is known, it is feasible 
to formulate hypotheses in advance and to take continuing measures 
and observations of the on-going change. The full advantages of the 
natural experiment remain to be exploited. 

It is true, of course, that to the extent that psychologists are
allowed to participate in a planned change by being fully informed 
and by being permitted to take detailed measures, the way is open 
for them to influence the planning of the change. Thus we have 
a bridge between the natural experiment and the field experiment. 
An interesting illustration of the possibilities here is found in the 
Curle-Trist study of the rehabilitation of returned British prisoners 
of war (4). Many of these men were having real difficulty in readjust- 
ing to their old roles in the family and community. Hence, camps 
were set up in which a group of men could live for a period and 
maintain something of the norms and roles which they had devel- 
oped as prisoners of war. But contacts with civilian life were grad- 
ually established to facilitate adjustment to the civilian world. This 
natural experiment was studied very carefully and was to some
extent carried out on the advice of psychiatrists and psychologists. 

The major advantage of the natural experiment over the lab- 
oratory or planned field experiment is that the manipulation of 
variables is much more powerful. The real world can and does pro- 
duce role reversals, drastic changes in group norms, institutional 
revolutions, and group conflict in a fashion impossible in the lab-
oratory. The design difficulties in the natural experiment, however,
are much greater than in the laboratory experiment. We generally 
lack a control group whose comparability to the experimental group 
is assured. Hence, provision should be made in the design for obtain-
ing measurements on as well-matched a control group as possible.
This does not guarantee that the experimental and control groups 
will be truly equated on everything but the independent variable. 
It does, however, increase the probabilities that the predicted results, 
if confirmed, are valid. 
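
One hedged way to approximate "as well-matched a control group as possible" is greedy nearest-neighbor matching on a few background characteristics. The sketch below is illustrative only. The function name, the group identifiers, and the characteristics (`size`, `tenure`) are hypothetical, the distance is a plain unscaled squared difference, and the text itself prescribes no particular matching procedure.

```python
def match_controls(experimental, candidates, keys):
    """Pair each experimental group with its closest unused candidate control.

    experimental, candidates: dicts mapping a group id to a dict of numeric
    background characteristics (hypothetical names, for illustration).
    keys: the characteristics to match on.
    """
    available = dict(candidates)  # copy so the caller's dict is untouched
    pairs = {}
    for group_id, traits in experimental.items():
        if not available:
            break  # more experimental groups than candidate controls

        def distance(cid):
            # Squared difference summed over the matching characteristics.
            return sum((traits[k] - available[cid][k]) ** 2 for k in keys)

        best = min(available, key=distance)
        pairs[group_id] = best
        del available[best]  # one-to-one matching: each control is used once
    return pairs


# Hypothetical work groups described by section size and mean years of tenure.
experimental = {"E1": {"size": 10, "tenure": 5}, "E2": {"size": 30, "tenure": 2}}
candidates = {
    "C1": {"size": 29, "tenure": 2},
    "C2": {"size": 11, "tenure": 5},
    "C3": {"size": 50, "tenure": 9},
}
pairs = match_controls(experimental, candidates, keys=["size", "tenure"])
```

As the paragraph above cautions, matching of this kind equates the groups only on the characteristics that were measured. It raises the plausibility of a confirmed prediction without guaranteeing comparability on everything else.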

A second provision to make is the detailed measurement of the
degree to which the independent variable is manifest in the sub-
groups under study. If, for example, in the Army experiments we
research workers save when they evaluate the research of other peo-
ple. Then objections are often raised to studies because they omit
many significant causal determinants. Thus, for example, to the
extent that the California studies of prejudice (1) are concerned
with the relation of personality dynamics to discriminatory atti-
tudes, it is not legitimate to criticize this objective because group
membership, economic interest, etc., are not investigated. We make
progress in science not by trying to solve all problems at once but
by going at things one step at a time.

The best opportunity for the use of hypothesis-testing is on the
occasion of the natural experiment. The difficulty with the use
of hypotheses in field studies is the inability to determine causal
relationships with any definiteness, since most of our measures are
not taken with respect to systematic changes in some ascertained
independent variable. Now a natural experiment is a change of
major importance engineered by policy makers and practitioners
and not by social scientists. It is experimental from the point of
view of the scientist rather than of the social engineer. But it can
afford opportunities for measuring the effect of the change on the
assumption that the change is so clear and drastic in nature that
there is no question of identifying it as the independent variable,
at least at a gross level.

For example, during World War II many Japanese who were
permanent residents of the Pacific Coast were uprooted from their
homes and communities and assigned to war relocation camps. In
his insightful book The Governing of Men, Alex Leighton (14)
describes the effects of this uprooting in a specific camp. Although
no measurements were taken, Leighton's observations showed a
significant role reversal in Japanese-American family structure.
When the group was part of the American society, the American-
born Japanese had assumed the dominant position in the home and
the community over the older family members who were Japanese-
born, a real departure from Japanese tradition. With the uprooting
and rejection by the dominant culture, the leadership function
reverted to the older people.

Our best knowledge of the effects of contact and personal com-
munication on racial prejudice comes from the natural experiments
of the Army during World War II in abolishing racial segrega-
tion practices in certain combat units (20). The Research Branch
setting out such tables, the investigators are bound to discover com-
plexities of a variable which need more detailed measurement
and qualifications of hypotheses in relation to special conditions.
The less the study attempts to make specific predictions, the less
detailed the preliminary sketch of the tables needs to be.
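
The practice of sketching the tables before any data are gathered can be illustrated with a minimal sketch. The helper names (`table_shell`, `fill_table`), the variable names, and the categories are all hypothetical. The point is only that records which fit no planned cell are set aside as the "complexities" that the advance sketch is meant to surface.

```python
def table_shell(row_categories, col_categories):
    """Build an empty cross-tabulation from the categories the design anticipates."""
    return {r: {c: 0 for c in col_categories} for r in row_categories}


def fill_table(shell, records, row_key, col_key):
    """Count records into a copy of the shell; collect those that fit no planned cell."""
    table = {r: dict(cols) for r, cols in shell.items()}  # leave the shell empty
    unplanned = []
    for rec in records:
        r, c = rec.get(row_key), rec.get(col_key)
        if r in table and c in table[r]:
            table[r][c] += 1
        else:
            unplanned.append(rec)  # a category the design did not foresee
    return table, unplanned


# Hypothetical design: style of supervision against section productivity.
shell = table_shell(["democratic", "other"], ["high", "low"])
pretest_records = [
    {"supervision": "democratic", "productivity": "high"},
    {"supervision": "democratic", "productivity": "high"},
    {"supervision": "other", "productivity": "low"},
    {"supervision": "mixed", "productivity": "high"},  # not in the planned rows
]
table, unplanned = fill_table(shell, pretest_records, "supervision", "productivity")
```

Any entries left in `unplanned` suggest that a variable needs finer categories, or that a hypothesis needs qualifying, before the full-scale study begins.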

Regardless of the degree of hypothesis-testing of the field study,
the design should exploit fully three of the natural advantages of
such investigations. The first advantage is that the field study tends
to continue over a period of time, so that it is possible to maintain
continued observation. Thus it follows that the timing of certain
variables may be ascertained. We can do very little in ex post facto
analysis, where we are dealing with variables which are untimed.
If we find in industry that supervisors who follow democratic
human relations skills have sections with larger productivity, we
do not know with any assurance which is cause and which is effect.
We can assume from general psychological knowledge that it is more
likely that the skills were not produced by higher productivity, but
we are much further along scientifically if we can time the occur-
rence of these two variables. In this case it would mean following
given supervisors as they are transferred from section to section.
Even where the time period of the field study is not itself very long,
it can afford chances to check on the timing of factors through the
consultation of records and the use of the memories of a number
of respondents. Unless the design clearly specifies the types of such
measures for the timing of variables, it will be difficult in later
analysis to pin down such information.

A second advantage of the field study which should be utilized
is the opportunity for the direct observation of interaction and of
social relationships. We make inferences about social process and
social structure from surveys, but in the field study we can observe
these factors more directly. If, in our study of the college com-
munity, we are interested in the effects of group membership in
various organizations, we should not be content with getting the
attitudes of these group members in isolation. We should have
observers at group meetings to record how people actually behave
in the group situation.

A third advantage of the field study is the important resource
of going beyond measures obtained from a single instrument. The
correlations from a single measuring instrument may be influenced
were studying the effect of contact, we would want to know how
many Negroes were actually introduced into a given combat unit.
In other words, we need a measure of the amount of the inde-
pendent variable which is independent of the measure of its effect.
A third provision should allow for continuing observations during
the course of the change process. Such observations can be helpful
in the interpretation of results because they may show many proc-
esses which intervene between the initial change and the final
outcome.

The fourth provision, and by far the most important in hy-
pothesis-testing in a field study, is the degree of elaborateness and
specificity of the predictions which are made in advance on the
basis of theoretical expectation. When we are conducting an ex-
ploratory study, we are handicapped in our ex post facto analysis
in interpreting the relationships which do appear. The direction and
meaning of such relationships can often be interpreted in many
ways. But in hypothesis-testing, where we have specified clearly and
in detail the relations we expect to find, we have a guarantee against
the inadequate controls and the loose type of measurement we
may have had to employ. The guarantee applies only to clear-cut
positive or negative findings, not to lack of correlation. If our pre-
dictions are borne out faithfully, then the relationships discovered
are not a function of spurious measures or erroneous interpretation
but are in all probability a true account of causal connections. But
if our predictions are neither clearly proved nor disproved, then we
can say little about the lack of relationships, since they could easily
result from deficiencies in method. Positive results are more convinc-
ing when the hypothesis has been elaborated into a set of inter-
dependent propositions. Moreover, the prospects of confirmation are
greater when such detailed predictions are set up to account for the
differential behavior of the various subgroups and the various types
of people under varying conditions. The design of the study must,
of course, be tailored to obtain measurements of such factors.

An excellent check in the research-design stage in studies trying
to measure relationships is the setting up of the tables at the time
the design is being elaborated. Especially if the study contains some
hypothesis-testing, it is advisable to attempt to anticipate the tabula-
tions and cross-tabulations or correlations which confirmation of
the hypotheses requires. By actually going through the mechanics of
houses destroyed in the respondent's town as ascertained by the
field investigator. Thus, the interview could deal with various prob-
lems of morale during war, such as confidence in leadership,
equality of sacrifice, etc., without the respondent's realizing the
major objective of the interview. Hence, relationships between
morale and degree of exposure to bombing could not be attributed
to the halo effect of the interviewing instrument. If, in designing
a field study, we limit ourselves to responses from the interview for
all our measures of the independent and the dependent variables,
we neglect the unusual potentialities for methodological advance in
the field approach.


The Pretesting of Research Instruments and Procedures

The elaboration of the research design of the study should
contain the specifications for the measures required. These measures
call for such instruments as interview schedules, questionnaires,
behavioral scales, and forms for the gathering of information.
Wherever the research objectives permit, instruments that have
already been standardized in other studies should be employed.
This use of common instruments would facilitate the comparison
of findings from study to study. Nevertheless, it is still true that in
most investigations for some time to come, we shall need many new
instruments developed to suit the objectives of the study.

It is essential that every new instrument be pretested before
the full-scale field operation. Such pretesting has three purposes:
(1) to develop the procedures for applying the research instrument
so that, for example, the scale or schedule can be used effectively
with respect to the time it takes to administer; (2) to test the word-
ing of questions so that they are suited to the understanding of the
audience; and (3) to ensure, as far as is practical, that the specific
questions or observations are really getting at the variable for
which a measure is needed. The pretesting of instruments is too
often limited to the first two objectives, since they pose readily
soluble problems. There is always a genuine danger that the third
objective may be slighted in favor of the interesting question, the
observation easy to make, and the general ease of administering the
study. The novice easily becomes discouraged by the variable that
is difficult of measurement; consequently he shifts, often without
by some halo in the application of the instrument. Leaders who
report good communication practices with their followers may also
be the leaders who report good communication practices with their
own superiors. This may mean not a true generality of relating
oneself to others in the organization but optimism or convention-
ality in answering the interviewer's questions. A field study permits
the obtaining of reciprocal perceptions and interdependent reac-
tions from different groups of people whose behavior is interrelated
to make up a social structure. Agreement in perception on the
part of people standing at various points in the hierarchy gives
greater confidence in the validity of the report. For example, when
workers, foremen, and stewards in a department of a factory all
agree about whether the foreman or steward has the greater power
in that department, we are on much safer ground than if we had
only the foremen or the stewards reporting on the situation. This
is important, not only for our knowledge about this differential
power variable in itself but in its relation to other responses of
foremen and workers. Moreover, discrepancies in perceptions can
in themselves be meaningful psychological factors, for we can
measure the perceptual distortion which people have toward some
competing group or toward their leaders as it relates to feelings of
hostility, lack of communication, in-group identification, etc. Finally,
reciprocal perceptions can be put together to provide a picture of
a total structure the complexity of which we might otherwise miss.

The use of independent measures in the field study should not
be confined to interviews with different subgroups or types of
people. It should be extended to include behavioral observation
and existing objective records. Again, the relationships that are
found between measures obtained in these different ways are more
convincing than if they all derived from a single instrument. It is
not so much a question of validating interview response against
behavior as it is a matter of assuring that real relationships exist
between the factors that are measured. In a study of the effects of
strategic bombing upon German civilian morale in the last war,
it was possible to obtain a measure of exposure to bombing inde-
pendent of the respondent's own report of his war experiences
(12, 21). Two sources of objective information were available con-
cerning exposure to bombing: the Air Force records of tons of
bombs dropped on the town in question and the percentage of
in a situation in which the population is very large, to determine
the sample in advance so that the pretesting can be carried on
outside of the sample to be used.

In the pretesting of instruments and procedures, it is not essen-
tial to obtain a representative sample of subjects, but it is important
to try to include some of the main types of people who will be
included in the final study. A pretest of an instrument on a few
subjects from only the upper education brackets will not anticipate
the problems to be met among those with minimum schooling.
This, of course, applies more to the last stages of pretesting. Often,
when we are first trying to develop a measure, we are less concerned
with replicating the final field situation than we are with getting
insight into the fundamental character of a given variable.

The pretesting of measures on populations very similar to those
which will be used in the larger study is necessary to determine
both the specific form of questions and of observational codes and
the types of measures applicable to specific groups of people. It has
been demonstrated that the written questionnaire is much more
widely applicable than was formerly assumed. It has the great
advantage over the personal interview of anonymity, which in
some situations can compensate for the knowledge gained from
actual personal contact with the respondent. Moreover, there are
great economies in the use of written questionnaires as against
interviews. But it is still true that written forms are much better
adapted to well-educated groups. The lower skill levels among
workers not only may have little formal schooling but they may
have spent very little time with paper-and-pencil materials since
their school days. It is more natural for them to express their ideas
orally than in writing. The exact line to be drawn in the use of
interviews as against written questionnaires can be determined
through pretests. It is often advantageous, when employing written
forms on a large scale, to draw a subsample of respondents for per-
sonal interviewing. Thus the biases in the one method can be
checked by means of the other.


The rttll scale Field OjJeiatioti 

Ideally, almost all the research problems are solved before the 
study goes into the field, but unfortunately these problems do not 
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realizing the extent of his shift, from the research objecme to the 
easily obtainable information 

Another difficulty with developing measures for the variables
designated in the design is the lack of a good criterion for deciding
when we have a good measure. Where it is not feasible to validate
the measure against an objective criterion, the following procedure
is recommended. The instrument should be tried on a number of
cases and the materials obtained should be coded by a number
of judges. The judges are instructed about the meaning of the
variables, and it is their task to make independent assignments of
the interviews or behavior records to ordered categories expressing
amounts of the variable. Or a number of observers can use a behavioral
scale in an actual situation and then check on whether
they can get the data needed and whether they agree on the way
in which they recorded the data. When this type of procedure is
used, it soon becomes evident whether or not the question or device
is bringing out the kind of material which can be reliably coded
as satisfying the variable. This, of course, is not a true validity
check, but it creates a presumption of validity and moves a long way
toward developing the sort of measures necessary for the study.
And the major codes for categorizing the interview and observational
materials should be worked out during this pretesting stage.
The final coding process should produce only minor additions to
and revisions of these major categories.
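
The check on agreement among judges can be illustrated with a small computation. The following is only a sketch under stated assumptions: the judges, the five interviews, the three-point ordered scale, and the function names are all hypothetical, and simple percentage agreement is used as one possible index, not as the procedure prescribed in the text.

```python
from itertools import combinations


def pairwise_agreement(codes_a, codes_b):
    """Share of cases that two judges assigned to the same category."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both judges must code the same cases")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def mean_agreement(codes_by_judge):
    """Average pairwise agreement over every pair of judges."""
    pairs = list(combinations(sorted(codes_by_judge), 2))
    total = sum(
        pairwise_agreement(codes_by_judge[first], codes_by_judge[second])
        for first, second in pairs
    )
    return total / len(pairs)


# Hypothetical pretest: three judges independently code five interviews
# into ordered categories 1-3 expressing amounts of the variable.
codes_by_judge = {
    "judge_a": [1, 2, 3, 2, 1],
    "judge_b": [1, 2, 3, 3, 1],
    "judge_c": [1, 2, 2, 2, 1],
}

overall_agreement = mean_agreement(codes_by_judge)
print(f"mean pairwise agreement: {overall_agreement:.2f}")
```

Low agreement on a question or scale in such a pretest would suggest that the material it elicits cannot yet be coded reliably. Percentage agreement does not correct for agreement expected by chance, so a stricter index may be preferred in practice.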

In most surveys and field studies, the time and attention given
to this form of pretesting are inadequate. And yet, to the extent
that we are seeking to discover relationships, there is no point of
greater critical significance than the translation of the research
objectives into operational measures. The most brilliant theory will
yield no results and the most refined statistical analysis will be a
waste of time if we fail to develop the operational measures of the
variables with which we are concerned. Unfortunately, it is too
often true that less time and effort are spent upon this problem
than upon almost any other phase of the research process.

It is important to carry out the pretests of measures and procedures
upon a population as similar as possible to the people who
will be studied. If the field study is directed at a large group or
community, it is possible to pretest the instruments without unduly
influencing the results of the larger study to come. It is advisable.
measurement of an entire group as possible and then passing on to
another group. This is also the more economical method. But the
effects of events in the world suggest that we do not interview all
of one type of respondent at a single time period. The difference
between groups may then be a difference between the people before
and after an event. This conflict can be solved by concentrating
heavily upon one group at a time but providing for small subsamples
of all groups for the major time periods of the study, even
if it means some reobservation of the first people studied.

The field study must face the important problems of obtaining
cooperation from the many individuals and subgroups in the structure
under investigation. In a survey, it is generally not necessary
to give much time or effort to obtaining the cooperation of respondents,
since most individuals will give an hour or more to the
interviewer if he will come at a time convenient to them. But in
a field study, where the research workers may spend weeks and even
months in the same community, where they may reinterview the same
people, seek access to privileged information, and attend all types
of meetings, the matter of obtaining community or group support
takes on unusual importance (10). The following procedures are
recommended as deserving consideration.

1. There is a real economy in going to the very top of the
structures under study to obtain the cooperation of the ranking
leaders. This is especially important if we are dealing with a
hierarchical structure, as in industry, where lower levels are always
dependent upon their superiors and are too insecure to risk welcoming
investigators from the outside. It is very likely that the
matter of cooperation will be referred up the line sooner or later,
and there is a much better chance for a favorable decision if top
leaders are consulted at the outset. In addition, the top leaders are
often more likely to understand what the research is all about and
are more open to conviction. The greatest resistance is frequently
found among the petty officials, not only because of their insecurity
but because of their general limitations.

2. The easiest entrance into a community or a social structure,
namely, coming in as the ally of individuals or groups who have a
special interest to exploit and who see the research as a means to
their ends, should be avoided. Thus the partisans of a specific
reform in a community or the executive vice-president of an indus-
remain solved during actual field operations. The content of a
research instrument, well pretested, takes on a new meaning because
of its relation to events which have suddenly changed. The
scales developed in advance for behavioral observation prove useless
for bringing out the critical points in the group meetings under
study. It is well, therefore, to have a research team in the field
which has some balance between ingenuity and soundness of judgment.
It is often necessary to improvise and make changes in procedures
during the course of the field work. But if ingenuity in
meeting new situations runs unchecked, the research team will lose
sight of the original research objectives. In the actual field situation,
the pressure is strong to meet the practical needs of the moment at
the sacrifice of long-range research plans.

The skills and personnel for the field operation differ considerably
from the requirements of a large-scale survey. The tasks
for the field worker are more varied and often more difficult than
for the production interviewers in the survey. Not only must the
field worker be able to enlist the cooperation of all groups in the
community, but much more of his interviewing must be with top
leaders and ranking officials (10). In the Human Relations Program
of the Survey Research Center, in which industrial and governmental
organizations were studied, it was found that the interviewing
staff for national surveys was not equipped to deal with
the leadership levels of importance to the investigation. Field
workers needed some experience with and knowledge of organizational
and administrative problems. Individuals who had had considerable
graduate training in psychology and the social sciences
and considerable experience in industry and government proved
to be the most effective workers. Moreover, field studies of this sort
need to differentiate in the skills of their field workers and to
include a few people who are capable of meeting outstanding
leaders at something approaching their own level.

Since most field studies extend in time, controls should be set
up to ensure comparability of the information obtained during
different periods of the study. People interviewed at a later period
in the study may have been affected by natural social events occurring
since the start of the study or by reports from and interaction
with earlier subjects. These two types of effects call for different
procedures. The effects from the study itself suggest as rapid a
It should be recognized that any field study entails some risks.
The introduction of observers prying into various aspects of community
life may not be relished by all groups in the community.
Certain questions may be offensive to some people. And at times the
climate of opinion may be such that people will cooperate neither
in launching the study nor in allowing themselves to be interviewed.
The risks should be known from the scouting explorations and
calculated in advance. In general, however, these risks are exaggerated
in democratic societies. People in general like to be interviewed
and are receptive to the idea of cooperating with social
scientists. The general experience is that the usual practical obstacles
in getting information which obsess the armchair critic are
not so difficult to overcome as the problems of research design.

4. Another aspect of long-run cooperation is the ethical standard
of the research worker in keeping faith with people who have
helped him. This involves a religious preservation of the anonymity
of the respondent and a thorough carrying out of the spirit and
letter of the obligations incurred in the study. Every provision
should be made to protect the identity of the respondent. Analysis
of materials should not be carried to such a point, and case materials
should not be quoted in such a fashion, as to permit identification
of specific individuals. If people are told that the identifying
marks on their questionnaires will be removed, these marks should
be removed. Research workers have rightfully made a fetish of this
preservation of the anonymity of respondents, and the assumption is
that the rule holds save where there is explicit prior understanding
and willingness on the part of the subject to become known.

Another, more subtle problem concerns the specificity with
which findings are reported for particular groups and subgroups.
Even if individuals are not identified, the reporting of results for
subgroups can place them in positions of advantage or disadvantage.
Here the general rule is that the cross breaks should be reported so as
to show the general relations rather than to show the incidence of
types of values and behaviors in small subgroups. Such general findings
should be available to everyone. The researcher cannot, of
course, control the uses to which his results are put, but he should
attempt to provide equal access to his general findings for all groups.




trial company seeking information on the delinquencies of his subordinates
may welcome the researcher and offer him support. The
alliance with such special pleaders is neither ethical nor wise. The
researcher's aim should be to enter the situation in the common
interests of all parties, and his findings should be equally available
to all groups and individuals. The cooperation offered a partisan has
two disadvantages. It may result in a lack of cooperation on the part
of other people, and it may exert undue influence upon the research
objectives.

It may be, of course, that this ideal is too difficult to achieve
in practice, since the researcher may sometimes have to accept help
where he can find it. But the broader the basis of his support, the
more all groups in the community see that social science has potentialities
for helping all those who want to avail themselves of its
findings, the sounder our operations can be. To ensure this type of
general community support, it is often advisable to set up an
advisory, sponsoring council representing all the diverse interests
in the community.

3. As much information about the study and its basic purposes
should be given to the people of the community, as well as to the
leaders, as can be revealed without prejudicing the results of the
investigation. This problem of the amount of information which it
is desirable to give is one of the real obstacles in field research and
field experimentation. If the study is wide in its coverage and if
it extends over more than a few days, it is impossible to maintain
a conspiracy of silence. An attempt to preserve secrecy merely increases
the spread and wildness of the rumors. Yet the researcher
does not want his potential subjects to know too much about his
specific hypotheses and objectives. A common solution is to present
an explicit statement at a fairly general level with one or two examples
of items which are not crucial to the entire study.

In presenting an explanation of the study, it is well to utilize
the accepted channels for communication in the various groups
in the community. If the information is limited to a single channel,
the study may become identified with the interests associated with
that channel. In industrial studies, an account of the coming research
should appear both in the company house organ and in the union
newspaper.
to the correlation measure to be employed, in that correlation
assumes definite properties with respect to the distribution of the
measures being correlated.

The decision about the cross breaks to be employed is already
determined in the hypothesis-testing study. In the exploratory study,
it is wise not to try to run every variable against every other variable
but to decide on the basis of all the information available what the
most promising relationships are likely to be.

In general, materials from the observations and interview
responses in a field study or a survey lend themselves more readily
to correlation analysis than to analysis of variance. Analysis of
variance is better suited to experimental studies where we have
approximately equal groups assigned to known values of the experimental
variable. The procedure of the breakdown or cross break is
a correlational procedure, and every table reporting a cross break
is a correlational table. Frequently, correlations are not computed,
since the comparison of groupings within the table can show the
relationship, though the precise amount of relationship requires a
correlational computation. Although it is a great advantage to be
able to state the degree of relationship for the entire table by a
single correlation coefficient, there is often no statistical rationale
for such a procedure. Correlations make assumptions which the
data may not justify. To compare the significance of the difference
between two subgroups in the table demands fewer assumptions
about the nature of the data.

The exploratory study calls for ex post facto analysis in which
we want to make the most plausible interpretation of our findings
even though the interpretation can never be definitive. The usual
procedure here is to try to develop a relationship or to check on its
spuriousness by holding factors constant through the use of triple
breaks. The most ingenious and systematic use of this method is
exemplified in the work of P. Lazarsfeld, who with P. Kendall has
presented a codification of such analysis procedures in Continuities
in Social Research (13). Three main types of elaboration are described,
which differ not so much in essential method as in giving
the analyst a framework for thinking about the relationships in his
data. One type of elaboration is to control for spurious factors. In
this case, we have discovered a relationship between two variables,
but we want to make sure within the limits of our data that it is
The Analysis of Materials

In a well-planned study in which hypothesis testing is the
principal objective, the major codes will have been developed in
the pretesting of instruments. In an exploratory study, many of the
codes must still be developed after or during the period of data
collection. Even though the study is exploratory, it is important to
develop codes with some degree of conceptualization rather than
the most meaning. In this process, the findings are used to suggest
hypotheses, which are then examined and tested further through
manipulation of the data. The form of statistical manipulation in
the three types is the same, but the logical and theoretical implications
differ. It is true, of course, that this type of hypothesis testing
after the fact is limited by the data already gathered and is therefore
no substitute for the controlled experiment. Nevertheless, such
analysis can greatly increase the probability of the soundness of the
interpretation of the data.

The meaning of the data should not be speculated about when
it is possible to test the speculations in the data themselves. Often,
by cross breaks and by holding factors constant, the statements which
the researcher sets forth as interpretation can be checked against
his own data. If confirmed, the interpretation can be stated not as
speculation but as a finding within the limitations of his study.

Field studies frequently pose the problem of the proper N to
use for computing the statistical significance of a difference. It is
common to use the number of individuals in a subgrouping as the N
without regard to the possibility of clustering effects. Where we are
dealing with a cluster, the proper N to use in computing the statistical
significance is the number of clusters, not the number of
individuals. Suppose, for example, we are studying the effects of
different types of adult education practices upon the acceptance
and use of local public health clinics. There are three discussion
groups of 12 members each in which a participation method was
used and three groups of 50 members each in which a lecture method
was used. In comparing the effectiveness of these methods, it is not
justifiable to regard our N's as 36 and 150 respectively. We do
not have 36 separate and independent manifestations of the discussion
method, since in any one group the use of the method was
colored by the discussion leader and the composition of the group.
Hence the true N here is 3. Similarly, in the lecture method there
are not 150 independent effects of the method but rather three
major applications of it. It should be remembered that the probable
error of the difference is based upon N's which are independent
measures of the effect in question.
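
A brief numerical sketch may make the point about clusters concrete. It treats each group, not each member, as one independent observation. The group-level rates of clinic use are invented for illustration, the function name is hypothetical, and the standard error of the difference is computed in place of the probable error mentioned above (the probable error is 0.6745 times the standard error).

```python
import math
from statistics import mean, stdev


def se_of_difference(values_a, values_b):
    """Standard error of the difference between two independent means."""
    var_a = stdev(values_a) ** 2 / len(values_a)
    var_b = stdev(values_b) ** 2 / len(values_b)
    return math.sqrt(var_a + var_b)


# Hypothetical share of members using the clinic, one figure per group.
discussion_group_rates = [0.70, 0.50, 0.60]  # three groups of 12 members
lecture_group_rates = [0.40, 0.30, 0.50]     # three groups of 50 members

# The N for each method is the number of groups (clusters), not 36 or 150.
n_discussion = len(discussion_group_rates)
n_lecture = len(lecture_group_rates)

difference = mean(discussion_group_rates) - mean(lecture_group_rates)
standard_error = se_of_difference(discussion_group_rates, lecture_group_rates)

print(f"N per method: {n_discussion} and {n_lecture}")
print(f"difference between methods: {difference:.2f}")
print(f"standard error of the difference: {standard_error:.3f}")
```

With only three independent units per method, the standard error is large relative to the difference, which is exactly the caution that counting 36 and 150 individuals would conceal.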

The field study is particularly susceptible to cluster effects. In
the laboratory we can set up our experiments to guard against this
difficulty. In surveys we can sample in random fashion to avoid
clustering. But in a community or a group under field study our
not a function of a third factor which would explain away the
relationship. Thus, a relationship which we discover between job
satisfaction and skill level of the job may be due to the fact that
higher-skill jobs are better paid. A triple break will allow us to hold
wages constant for different skill levels in relation to job satisfactions.

Instead of trying to explain away our findings as due to some
spurious factor, we may want to show that some other factor, whose
timing we can establish, has intervened to produce the relationship.
Thus in the studies of the Research Branch of the Army, the
poorer-educated men, though in general less critical, were more
likely to feel that they should have been deferred than were the
better-educated men (13). Lazarsfeld hypothesizes that a variable
may have intervened between the factor of education and feelings
about deferment as the real reason for the relationship, specifically
the factor of relative deprivation. The poorer-educated men were
supposedly coming from groups where deferments were higher than
among the better educated. If data were available on the rate of
deferment in this study and if, on the triple break, the correlation
was really with deferment rate, then this interpretation of the relationship
becomes plausible. The essential point in this type of
elaboration is that the test factor can be pinned down in time to
some actual variable that has intervened.
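
The logic of the triple break for job satisfaction, skill level, and wages can be sketched with a small invented table. All counts, field names, and function names below are hypothetical. The data are constructed so that the apparent relation between skill and satisfaction disappears once wages are held constant, which is the pattern a spurious relationship would show.

```python
from collections import defaultdict


def satisfaction_rates(records, by):
    """Proportion satisfied for each combination of the fields named in `by`."""
    totals = defaultdict(int)
    satisfied = defaultdict(int)
    for record in records:
        key = tuple(record[field] for field in by)
        totals[key] += 1
        satisfied[key] += record["satisfied"]
    return {key: satisfied[key] / totals[key] for key in totals}


def make_cell(skill, wage, n_workers, n_satisfied):
    """Build hypothetical worker records for one cell of the table."""
    return [
        {"skill": skill, "wage": wage, "satisfied": int(i < n_satisfied)}
        for i in range(n_workers)
    ]


records = (
    make_cell("high", "high", 8, 6)
    + make_cell("high", "low", 4, 1)
    + make_cell("low", "high", 4, 3)
    + make_cell("low", "low", 8, 2)
)

# Simple cross break: satisfaction by skill level.
by_skill = satisfaction_rates(records, by=("skill",))

# Triple break: satisfaction by skill level within each wage level.
by_wage_and_skill = satisfaction_rates(records, by=("wage", "skill"))

for key, rate in sorted(by_skill.items()):
    print("skill", key, round(rate, 2))
for key, rate in sorted(by_wage_and_skill.items()):
    print("wage, skill", key, round(rate, 2))
```

In this constructed case the simple cross break shows higher satisfaction among high-skill workers, while within each wage level the two skill groups are equally satisfied. Had the difference persisted within wage levels, the original relationship would have survived the test.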


A third type of elaboration is specification, that is, trying to
find the conditions under which a relationship will be accentuated.
We may find in an industrial study, for example, a positive but
small relationship between “good” human relations skills of the
foreman and the morale of his workers. We have theoretical reasons
for expecting the relationship to be higher, and hence we seek to
specify the conditions under which the correlation should increase.
We postulate, therefore, that in departments where the foremen
are effective in dealing with their superiors, the morale of the men
will be higher, since, in addition to understanding their men, they
can accomplish things on their behalf. By means of a triple break
in which we compare the morale of workers under foremen with
good human relations practices with the morale of workers under
foremen less skilled, for different degrees or conditions of foreman
effectiveness up the line, we can document our specifications about
the nature of the relationship.

These three types of elaboration are convenient ways in which 
the researcher can utilize post hoc analysis in making his data yield 
tact with on-going social events, can serve as a check against the
omission of significant variables. In a long-range program of research,
there should be a two-way interaction between experimentation
and field studies. The findings from field studies may raise
questions which need for their solution the more rigorous methods
of experimentation. Conversely, the experimental results can furnish
a basis for the formulation of some of the research objectives of
the field study. If the relations which hold within the laboratory
also prove of significance in the field situation, it is an indication
that the laboratory has not moved in too artifactual a direction.

There is an important intervening step, however, between the
field study and the laboratory experiment, namely, the field experiment.
The natural experiment, previously discussed, is a social
change which takes place without any action by the social researcher.
It just happens to be an interesting change for him to measure.
The field experiment, however, is a social change engineered by
the researcher. He is instrumental in effecting the manipulation
of a set of variables in a life situation. The field experiment is the
logical connection between the field study and the laboratory experiment.
It follows the logical procedures of the laboratory more
closely, but it nevertheless deals with factors operating in the field
situation and thus, like the field study, is more concerned with
“global” variables.

Field studies can furnish the essential information which will 
make a successful field experiment possible. The most natural step 
in following up the leads from the field study is experimentation in 
the same situation, so that there can be some actual manipulation 
of the factors which appeared to be causal determinants in the field 
study. It is to field experimentation, therefore, that the next chapter 
is devoted. 
cases are often pocketed in homogeneous subgroups, where the
clustering effect must be carefully considered.


THE PLACE OF FIELD STUDIES IN
PROGRAMMATIC RESEARCH

In building a science of social psychology it is desirable to
take advantage of the different settings in which our phenomena
occur and of approaches which utilize the particular advantages of
a given setting. The field study is unique in enabling us to observe
and measure social processes in their natural occurrence. On the one
hand, it can give depth of understanding to survey findings. On the
other hand, it can give to the experimenter rich insights and
hypotheses for more rigorous experimentation and can prevent the
laboratory from developing a system of concepts which have little
to do with the way in which people really behave.

In dealing with social events in a natural setting, the field study
must operate in an open system of interacting variables, in which
so-called alien factors may influence the outcome. But the limits of
our subject matter are as yet so poorly defined that not all psychologists
would agree about what should be considered an alien factor.
Hence, at the present stage of our discipline, there is much to be
gained from working both in the laboratory, where the restricted
model from physics can be the ideal, and in the field situation,
where the looser model sometimes employed in the biological sciences
is the prototype.

Another way of describing the potential usefulness of field
studies is to call attention to the history of the psychology of perception.
Early experimental work, in its attempt at scientific rigor,
... the wholistic qualities of the psychological field and proceeded
in overanalytic fashion to work with highly artificial elements
of consciousness. The same weakness was to be observed in
... of experimental work on social processes in the laboratory
... emphasis upon social facilitation. It is important to analyze
the complexities of social interaction and social relationships, but
there is always the danger that our analysis may miss the significant
functional properties and deal with elements which are more
artifacts than functional entities. Field studies, by their close con-

The wide variety of truly experimental studies that have taken
place in field settings vary in purpose all the way from the development
of social-psychological theory to the immediate solution of
some practical social problem. Sometimes both purposes are present,
as in the experimental forms of “action research” which have the
dual purpose of bringing about social change and at the same time
contributing to basic social science (7, 8, 27, 39). But most applied
research has the major purpose of obtaining facts and attitudes or
evaluating methods which will be of immediate value in solving
some specific applied problem, although theoretical development
may be a minor purpose (44). Such practically oriented research is
The desirability of widening the scope of experimental
social psychology through experimentation in “real life” settings
has been recognized for some time. Lewin has pointed out that
“Although it appears to be possible to study certain problems of
society in experimentally created, smaller laboratory groups, we
shall have also to develop research techniques that will permit us
to do real experiments within existing ‘natural’ social groups. In
my opinion, the practical and theoretical importance of these types
of experiments is of the first magnitude” (30, p. 164).


WHAT IS A FIELD EXPERIMENT?

Although considerable progress has been made in developing
it, the field experiment is not yet a well-developed
method of basic research in social science. Rather, there is a variety
of related methods, such as action research, evaluation research,
operational research, etc., which may include experimental studies
in field settings. In discussing current conceptions of experimental
method in sociology, Greenwood (18, Chap. 2) describes five types:
(1) the “pure” experiment, which is here called the laboratory
experiment (see Chap. 4), (2) the uncontrolled experiment, or what
we have called the natural experiment (see pp. 78-79), (3) the ex
post facto experiment, in which the investigator tries to trace backwards
from an effect which has already occurred to its causes, (4) the
trial-and-error experiment, which seems to refer to all sorts of trials
by laymen of new forms of social behavior, (5) the controlled observational
study. None of these five categories describes the field
experiment, although the major dimensions along which the categories
differ are fairly adequate for such a description. Accordingly,
the meaning of the field experiment and its relation to other
methods can best be clarified by considering variations of three
aspects: the design of the research, the setting, and the purpose.

The essential factor which distinguishes the field experiment
from the more common "field study" is the design of the research.
The field experiment involves the actual manipulation of conditions 
by the experimenter in order to determine causal relations, whereas 
in the field study the researcher uses the selection of subjects and 
the measurement of existing conditions in the field setting as a 
method of determining correlations. As we shall see later, one of the
crucial methodological problems of the field experiment is devising 
ways of manipulating the independent variable. It is this difficulty 
which has led to the use of the “natural experiment," in which the 
researcher opportunistically capitalizes upon some on-going changes 
and studies their effects in an experimental design. If these natural 
changes have already occurred by the time the social scientist arrives 
on the scene, it may still be possible to gather sufficient data after 
the fact to fill out the design of a crude ex post facto experiment. 
In the field experiment, the manipulation of the independent vari- 
able is not left to nature but is contrived, at least in part, by the 
experimenter; thus, the design is planned by him beforehand. 

The wide variety of truly experimental studies that have taken
place in field settings vary in purpose all the way from the
development of social-psychological theory to the immediate solution of
some practical social problem. Sometimes both purposes are present,
as in the experimental forms of "action research" which have the
dual purpose of bringing about social change and at the same time
contributing to basic social science (7, 8, 27, 39). But most applied
research has the major purpose of obtaining facts and attitudes or
evaluating methods which will be of immediate value in solving
some specific applied problem, although theoretical development
may be a minor purpose (44). Such practically oriented research is
the most common type of field experiment. For example, studies
evaluating the relative effectiveness of two types of political
propaganda or of two teaching methods or of several advertising appeals
attempt to get fairly immediate answers to practical problems of
politics, education, or advertising without attempting to apply any
general theory. In fact, there are very few field experiments with
a strongly theoretical orientation, yet there is a sufficient stockpile
of theory and knowledge in social psychology to warrant many such
field experiments. Most of all we need experiments whose purpose
is to test the applicability to real-life situations of the known
scientific laws or hypotheses specifically developed in controlled
laboratory settings. Throughout this chapter we shall try to emphasize
the scientific and theoretical purposes of the field experiment.

The setting for a field experiment is some real existing social
situation in which the phenomena to be studied are commonly
found. By implication it is not an artificial situation created in
a research laboratory. This distinction, however, is not nearly so
clear-cut as it seems at first glance, for it is not important whether
the social phenomena occur in a building called a laboratory rather
than in a school or some other real social institution. The relevant
distinction here seems to be between studying real and studying
artificial social phenomena. One meaning of "artificial" as applied
to the behavior of people in the laboratory seems to be that their
behavior is determined by their role of being a subject, that they
would not act the same way if they were not in this role. Block and
Block have pointed out that the middle-class subject almost
invariably structures the experimental situation as one calling for a
submissive role in relation to an authority figure (3). In so far as social
behavior is role-determined, it is clear that findings obtained with
one role cannot be generalized to apply to other roles without
additional research. In addition, the behavior of subjects in a laboratory
experiment is highly restricted by the rules and procedures
instituted in order to control conditions; frequently, this simplification
involves the creation of new groups which will not be influenced
by their past history or their present social setting. The laws which
hold for such restricted situations may not apply without changes
to the more complex settings of real life. Usually a field experiment
is not subject to such artificiality and thus avoids this problem
of generalizing to real-life situations. That this is not always the
case, however, is well illustrated in the famous Hawthorne
experiments (37). From a methodological point of view, the most
interesting finding was what we might call the "Hawthorne effect." In
order to manipulate more precisely the physical factors affecting
production, the experimenters had set up a special experimental
room for a small group of girls who were wiring relays. This wiring
room was separated from the rest of the factory, and the girls
working in it received special attention from both outside experimenters
and the management of the plant. Careful studies of this wiring
group showed marked increases in production which were related
only to the special social position and social treatment they received.
Thus, it was the "artificial" social aspects of the experimental
conditions set up for measurement which produced the increases in
group productivity. The distinctions between "artificial" laboratory
experiments and "real" field experiments are, therefore, matters of
degree; one field experiment will vary greatly from another in the
complexity of the social setting, the controls introduced by the
experimenter, and the roles played by the subjects.

For the purposes of this chapter, then, we shall define a field 
experiment as a theoretically oriented research project in which the 
experimenter manipulates an independent variable in some real 
social setting in order to test some hypothesis. 


PLANNING A FIELD EXPERIMENT 

The idealized sequence of steps in planning a scientific
experiment might include, first, the selection of a problem on the basis
of theoretical considerations and the formulation of precise
hypotheses, and then the selection of some appropriate methodology
and the creation of an experimental design. No such simple sequence
is usual in planning a field experiment, because the social scientist
does not have free access to the whole world as his laboratory or
the power to carry out most of the experimental manipulations he
might conceive. Accordingly, he must proceed somewhat
opportunistically. Frequently, he is first presented with an opportunity to do
research in some specific setting, so that the choice of problem and
method will come afterwards. In most cases it will be wise to
consider simultaneously the selection of a problem, the selection
of a field setting, and the designing of the study, since all three of
these factors are highly interdependent. The kind of control groups
that are needed, for example, will depend on the problem selected;
and the possibility of obtaining these groups will depend on what
is available in the setting. This interdependence should be kept in
mind as we discuss, in sequence, selecting and formulating a research
problem, selecting a setting, and research design.

Selecting and Formulating a Research Problem 

SELECTING THE PROBLEM. The research problem and the research
methodology should be selected for their appropriateness to
each other. It is not easy at present to make generalizations about
the kinds of problems for which the field experiment is an
appropriate methodology, because there has been too little field
experimentation in social psychology. Yet one can see certain common
types of problems in the work that has been done. Perhaps the 
largest category of problems for which a field experiment has been 
the appropriate methodology would be the studies on "how to do 
it.” The field experiment has been used for studying a variety of 
techniques and methods, such as advertising techniques, training 
methods (4, 19, 31, 38), the effects of group decision (28, 29) and 
participation (9), methods of group therapy (10), political propa- 
ganda (1, 20), etc. In all of these cases, one or more methods for 
bringing about a change already existed prior to the research. In 
other words, these experiments tested the effectiveness of change 
procedures which had been used for purposes other than research. 

Also, there have been several experiments which made use of 
the existing formal or informal social structure. Jackson, for exam- 
ple, studied leadership by actually exchanging foremen (21). Snyder 
made use of existing leaders and their groups in experiments on 
the influence of military leaders (40). Van Zelst studied the effects 
of sociometric regrouping on the productivity of work groups (43). 
Social structure has also been used in experiments on rumor (2). 

Field experimentation seems also to be an appropriate method
to use with a problem or phenomenon too difficult to study in the
laboratory, for example, the changing of food habits (28) or the
effectiveness of political propaganda (1, 20) or "mass education" (42).
Probably a field experiment is used in many of these cases not so
much because it would be impossible to devise an experiment in
the laboratory but because such artificially created phenomena
would be too weak or would not be dynamically equivalent to the
real thing. Thus, Annis (1) collaborated with the printer to have
an editorial "planted" in the university daily so that the subjects
would read it with their normal set without knowing that it was
an experimental stimulus. Had they known, it is doubtful that the
experiment would have been dynamically equivalent to the usual
political propaganda.

Conversely, one of the more important determinants in selecting 
a problem for a field experiment is the fact that it is possible to 
study by this method only those hypotheses in which one can manip- 
ulate the independent variable or produce a change. Here it is 
most important to keep in mind the rule: "Start strong." Except
where experimental methods are already well developed, the experi- 
menter should attempt to maximize the variations in the inde- 
pendent variable or the differences among the experimental treat- 
ments. This is necessary both because our methods of measurement 
in social psychology are often so crude as to be able to reflect only 
rather gross changes and because the experimenter, at least at the 
beginning, is quite likely to produce far smaller changes than he 
attempts. There are a variety of determinants of the changes that 
can be produced, such as the role of the experimenter, his knowl- 
edge, and his skill. These will be discussed further in a later section 
of this chapter. 

This same rule would require that we study major rather than 
minor independent variables. If our dependent variable is deter- 
mined by several factors, we cannot hope to determine experi- 
mentally the effects of the minor ones until we understand (and 
can control) the major ones which account for most of the variance. 
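
The "start strong" rule can be made concrete with a rough calculation. The sketch below is an illustration added here, not a procedure taken from the studies cited. It assumes two treatment groups compared on their means, a known standard deviation of measurement, and the usual normal-approximation formula for sample size; the function name subjects_per_group and all the numbers are hypothetical.

```python
import math
from statistics import NormalDist


def subjects_per_group(difference, noise_sd, alpha=0.05, power=0.80):
    """Approximate subjects needed in each of two groups to detect a
    given mean difference (two-sided test, normal approximation).

    difference: true difference between the two treatment means
    noise_sd:   standard deviation of the measure within a group
                (cruder measurement means a larger noise_sd)
    """
    if difference <= 0 or noise_sd <= 0:
        raise ValueError("difference and noise_sd must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power
    standardized = difference / noise_sd           # difference in sd units
    return math.ceil(2 * (z_alpha + z_power) ** 2 / standardized ** 2)


if __name__ == "__main__":
    # Hypothetical treatment differences, all with the same crude measure.
    for difference in (0.2, 0.5, 1.0):
        print(difference, subjects_per_group(difference, noise_sd=1.0))
```

Under these assumptions, halving the difference between treatments roughly quadruples the number of subjects required, which is one way of seeing why a weak manipulation combined with a crude measure seldom yields a detectable effect.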

In other cases, the field experiment seems to have been the 
preferred methodology because it reduces the problem of general- 
ization and application of results. Much of the "operational re- 
search" in the armed services and in other organizations is of this 
character. Such applied research usually evaluates the effectiveness 
of operational procedures. Many a study of college sophomores has
been criticized on the ground that the same results would not hold 
for soldiers or industrial workers, but such an objection cannot be 
made for a field experiment using soldiers or industrial workers as
subjects. In short, by study of the specific setting and problem
which is to be solved, it is possible to avoid the problem of
generalizing from one situation to another.

On the other hand, some problems are appropriate for a field 
experiment precisely because they require generalization to a real 
situation. These are the hypotheses proved in laboratory experi- 
ments but not yet sufficiently studied in field settings. We have 
already indicated that this should be an increasing source of field 
experiments. 

Finally, of course, many problems appropriate for field experi- 
ments have come out of field studies. Often these are theoretical 
problems in which a hypothesis has been too confounded in the 
nonexperimental studies or where a correlation has been estab- 
lished without the possibility of determining the direction of causa- 
tion. The Survey Research Center at the University of Michigan 
has conducted several field studies of industrial organizations in 
which it has found that organizational units with high productivity 
have more democratic supervision. Because either could be reasonably
interpreted as the cause of the other, the Center put the matter
to experimental test (34, 35). In a field experiment the locus of
organizational control was actually manipulated toward more
decentralization in one set of groups and toward more centralization
in another matched set of groups. The hypothesis would predict 
increases in productivity in the first treatment and decreases in the 
second treatment. With this experimental design it is no longer 
possible to say that high productivity might be causing the demo- 
cratic supervision. 
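
The logic of this design can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the Center's analysis: the productivity figures are invented, the function names are hypothetical, and a real analysis would also test whether the observed changes exceed chance variation.

```python
from statistics import mean


def mean_change(pairs):
    """Average (after - before) productivity change over (before, after) pairs."""
    return mean(after - before for before, after in pairs)


def summarize(decentralized, centralized):
    """Compare the two matched treatments.

    Because the experimenter set the locus of control, any difference in
    productivity change cannot be attributed to productivity having
    caused the type of supervision.
    """
    up = mean_change(decentralized)
    down = mean_change(centralized)
    return {
        "decentralized_change": up,
        "centralized_change": down,
        # The hypothesis predicts a rise in the first treatment
        # and a fall in the second.
        "consistent_with_hypothesis": up > 0 and down < 0,
    }


if __name__ == "__main__":
    # Invented (before, after) productivity scores for matched groups.
    decentralized = [(50, 54), (48, 53), (52, 55)]
    centralized = [(50, 47), (49, 47), (51, 50)]
    print(summarize(decentralized, centralized))
```

The point of the sketch is only the structure of the comparison: the treatments are assigned by the experimenter to matched groups, and the prediction concerns the direction of change within each treatment.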


FORMULATING THE PROBLEM. In order to make a scientific
contribution, the hypothesis selected for study in a field experiment
must be statable in general terms, i.e., the variables must be
conceptualized (30, Chap. 2). This will tend to be true almost
automatically for an experiment designed specifically to test some law
which was developed in basic research. If the initial selection of a
problem has been more practically oriented, however, it will be
important to try to conceptualize the problem or the procedures.
For example, Rosenberg's first step in an experiment on the
effectiveness of role playing as a method of training (38) was to try to
develop some insight into the nature of role playing. This led to the
identification of one major variable which seemed important in
accounting for the distinctive characteristics of role playing, namely,
the degree of involvement in the role. Other characteristics were
neglected, and attention was focused on this one variable.
Accordingly, the experiment attempted to create three degrees of
involvement in the role in order to test explicit hypotheses concerning the
effect of such involvement on perception, feeling, and participation
in discussion.

As a problem or a social technique for bringing about a change 
becomes more conceptualized, it also becomes clearer that it involves 
several hypotheses and that there is a real choice as to how global
or how analytical to be in stating a hypothesis. Whether to work 
with a global syndrome or with a narrowly defined variable poses 
a problem about which there is considerable difference of opinion 
among different disciplines of the social sciences and among dif- 
ferent theoretical approaches. We cannot discuss it fully here. 
Rather, we shall attempt to point out certain aspects especially 
relevant to the field experiment. 

Consider Fig. 1 as a schematic representation of the causal
interrelations among the experimental manipulations, the inde- 
pendent variables to be manipulated, and the dependent variables 
whose changes are to be measured. In this example, our general 
hypothesis predicts that the more democratic the behavior of the 
first-line foreman, the higher the production of the employees (the 
arrows indicate the causal relations). Our independent variable of 
foreman behavior might be manipulated, for example, by training 
the foremen (A in Fig. 1). But how shall we define "democratic
behavior" for the purposes of our training course? This pattern of
behavior consists of many parts or dimensions which might be
treated as separate variables; for example, the amount of freedom
or closeness of supervision (I_a), the amount of participation
permitted in planning (I_b), the arbitrariness of discipline (I_c), and how
much the foreman does to satisfy the needs of his employees (I_d).

But these aspects are interdependent so that a change in one
will produce a change in any neighboring aspect. (In Fig. 1, low
interdependence is represented by a heavy boundary and high
interdependence is represented by a less heavy boundary between
neighboring regions.) Now, in analyzing the data from a nonexperimental
field study of production as determined by the behavior of the
foreman, the investigator is free to choose any part or any
combination of parts of foreman behavior as the independent variable; this
includes the freedom erroneously to treat I_a plus I_d as a single
variable instead of choosing some combination more closely
corresponding to the actual interdependence (for example, I_a and I_b) or
choosing I_a alone. In a field experiment, on the other hand, there
is less probability of error in deciding how analytical to be in
formulating the independent variable.


[Figure 1 appears here. Panel labels: Experimental Manipulation,
Independent Variables, Dependent Variable.]

Fig. 1. The causal relations among experimental manipulations,
independent variables, and dependent variables.


In planning the foreman training, let us suppose that the
experimenter tried to be too atomistic by varying I_a alone. If, in
fact, I_a is highly interdependent with I_b, then a change in I_a would
produce a change in I_b. Thus, the experimenter would have actually
varied both variables, a result which corresponds better to the real
structure of the situation. If he has measures of all four independent
variables, the experimenter would then notice the changes in I_b and
would reformulate his independent variable in a way which would
yield better predictions. By attempts actually to manipulate a
variable in the field setting, it seems that the social scientist has a better
chance of finding out what parts necessarily hang together and thus
partly avoids the error of a too broad or too narrow definition of
his variables. This is important, because it is one way of making our
concepts conform to the essential nature of the phenomena we study.

Fig. 1 illustrates also the advantage of the field experiment over
the field study in determining the direction of causation. The field
experiment can distinguish between independent and dependent
variables, whereas the field study can only establish correlations
among variables. In this example we have assumed that [...]
the level of production will also change certain parts of the
democratic behavior of the foreman; i.e., it will reduce the closeness of
his supervision. Since we have also assumed that closeness of
supervision causes lower production, we have here a circular social process.
In order to analyze fully the nature of this process, we
would have to conduct two field experiments in which each of the
two variables was manipulated as an independent variable.
Probably a good many real social processes contain circular
relations of this type. It is important to analyze these processes [...]

[...] state explicitly the hypothesis to be tested. As in laboratory
experiments, the more this hypothesis is derivable from a general theory
the better. It is also important to formulate hypotheses about
other possible causes of the same dependent variable, so that one
can try to hold these other factors constant in the design of [...]

RESEARCH OBJECTIVES [...]. The possibilities for the selection
of a problem for a field experiment [...] will be determined
partly by the relation of the purposes of the scientist to the practical
purposes of the subjects and their institutions. Sometimes there will
be common goals; often there will be differences. These differences
in goals mean that the activities in which the scientist engages to
reach his scientific objectives may be seen by the practitioner as
interfering with his goals. Accordingly, differing purposes set
a limit on the freedom of the experimenter to choose any problem
of interest to him. The less interest the practitioner has in the
development of science, the more limited will be the researcher's
choice.

One tempting solution to this conflict, which may frequently
have bad effects on the research, is the attempt to study too many
problems at the same time; that is, the social scientist collects data
in order to test certain theoretical hypotheses and at the same time 
tries to collect additional data to satisfy several of the practical 
needs of the organization in which he is working. This results in 
a dispersion of effort and a lowering of the quality of the research. 
Another solution which reduces the scientific value of the research 
is to compromise on the degree of generality in formulating the 
problem: for the purposes of science, maximum generality and 
abstractness of the concepts are desirable, whereas the practitioner 
often wants knowledge about his specific situation with all its 
idiosyncrasies. The best solution to this problem seems to be to 
recognize frankly the possibilities of conflict as well as the
possibilities of mutual interest and then to select for study a problem
of maximum benefit to both the scientist and the practitioner. We
shall have more to say about this cooperative relationship later
(see pp. 123 f.).

Selecting a Setting

ESTABLISHING CONTACT. Unless he is already established in the
research department of some organization, the experimenter will
face the problem of establishing contact with appropriate field
settings. Social scientists in organized research centers will usually find
this much easier because of the many institutional contacts which
already exist. If the research center offers services to practitioners
(consultation, training institutes and conferences, lectures, etc.) there
will be many client organizations with which there is good rapport.
There may even be frequent requests that the center do research
projects.

In other cases, the scientist must take the initiative to develop
contacts. One lead to possible settings is an examination of the kinds
of organizations already supporting similar research. Government
and business organizations, for example, now support a variety of
field projects. In other organizations not supporting research the
researcher will find some people who are favorably disposed toward
social-science research and willing to cooperate. Contacts can be
made most easily with those organizations which feel the need for
research to help solve their own problems. Particularly for a field
experiment, which will entail certain new and unforeseen changes
in the organization, it is best to contact organizations where there
is both a motivation to contribute to science and a realistic
expectation that the research will benefit the organization. This suggestion
implies that the researcher must show his willingness to help the
organization. No statements, however, will be as convincing as
past services to organizations or past field studies which have had
beneficial results.

In selecting a setting for a field experiment, the researcher will,
of course, look first for a social situation which contains the
phenomena he wishes to study. This is not a simple matter of
determining the presence or absence of the phenomenon, but rather a
question of discovering whether it occurs in a sufficient [...]
strongly enough so that research is feasible. Actually this may require
considerable knowledge and understanding of the setting. For
example, a researcher who wished to study the [...]
standards about performance in work groups as a function of the
cohesiveness of the groups selected bomber [...]
the Air Force. This seemed an ideal setting because there were
many identical groups. Later he discovered that the groups were
too small, that their performance was difficult to distinguish from
the work of other maintenance men, that the crews were not, in
fact, identical, and that probably the group standards [...].

THE INFLUENCE OF THE EXPERIMENTER. Among [...]
which may be relevant to the problem he wishes to study
will be important variations in the degree to which the experimenter
has freedom and power in the setting or organization [...].
First of all, he needs the power to manipulate the independent
variable or to try out some technique of [...]
controlled design. The power and influence needed for
experimenting will depend very much on how upsetting the experimental
procedure is to the organization.
Both his power and his skill in carrying out the experimental
treatment in a field setting may depend upon the researcher's
relationship to the practitioner. If his own skills or the power of his
role are insufficient, the researcher may be able to use a skillful
practitioner as the actual manipulator of a variable. Not only does
such a relationship increase the possible range of successful
experimental problems, but there is frequently a great value in
separating the action role from the role of data collector. For example, in
experiments on therapy, training, and the like, there is probably
much less bias if the researcher is not also at the same time the
therapist or trainer. Thus, it will often be desirable or even essential
for the researcher to select a setting in which such help and
cooperation from a skillful practitioner are available.

It is also desirable that the experimenter have the power to
control other factors which might confound the experiment. In most
field settings this seems to be possible to only a limited degree. An
inferior substitute for actual control is knowledge, especially
foreknowledge, about the confounding factors. For example, if one wishes
to conduct an experiment on production in a factory, it may be
essential to prevent the factory from laying off workers in the middle
of the experiment, yet it may be impossible for the researcher to
control this. However, if he can know far enough in advance of
any impending lay-offs, he may be able to plan his experiments
during a time when this confounding factor will not disturb them.

Freedom of access to the data is an obvious consideration in
selecting a setting for a field experiment. Organizations vary widely
not only in their willingness to permit the scientist to collect data
but also in the extent to which good quantitative data are already
available. An equally essential freedom for the scientist is the
freedom to publish the results of his research without undue censorship.

DIFFICULTIES IN THE FIELD SETTING. Conducting field
experiments in areas in which there is crystallized social conflict raises
some extremely difficult and delicate problems. In order to develop
an adequate and useful social science, it now seems essential to make
field studies of such conflicts. This would imply the selection of
settings in which tension and conflict are great. In order to study
problems which are not a function of conflict, however, these settings
should be avoided, because there is likely to be an iron curtain
preventing study and the data which can be collected are most likely
to be biased. In a setting in which there is strong hostility between
management and labor, gaining the cooperation of one group may
virtually preclude gaining cooperation from the other [...]
the scientist may be forced into a position which [...]
"taking sides," and therefore access [...]
be permitted. It seems important for the scientist to maintain an
independent role even if it means that he cannot study severe
conflict situations (17, 24). It is especially true that
conflict within an organization can make impractical the launching
of a field experiment which requires [...] effort (22).

The extent of recognized problems in an organization and the
felt need for help in solving these problems present a similar
difficulty in selecting an appropriate setting. [...] organization is
faced with a large number of acute [...] likely that
its leaders will feel able or willing to [...] fighting
to the investigation of the [...] difficulties. On the
other hand, the management which [...] has solved all of
its problems has little incentive to engage in a research relationship
of any kind" (22, p. 66). [...] engaging in a field experiment requires
more change than permitting a field study, an organization
usually must perceive the [...] former
than in the latter case in order to engage in research. Jaques (25),
who places heavy emphasis on the [...] aspects of his work,
studies only those problems concerning which the client requests
help.

THE NEED FOR SCOUTING. Taken together, all the
considerations so far discussed imply the need for a [...] body of
information, some of it rather subtle and difficult to acquire, before
a wise choice of setting can be made. Unless the experimenter
already has direct knowledge of the situation through previous field
research in it, it will be necessary to do a considerable
amount of "scouting" (see also p. [...]). The experimenter should
visit the setting so that he may directly observe behavior and
interview a number of people in [...]. It is particularly
important to discuss their conception of the experiment with those
who will help to make the decisions about the research and, if
possible, to talk with those who might [...] attitudes
toward it. Particularly where there is [...] conflict between
opposing parties, it may take [...] agreement that the
research should proceed than [...] the necessary
information about the setting. In his intensive therapeutic study of a factory,
Jaques spent three months working out the arrangements for the
study and the relations between the research team and various parts
of the factory (24).

EXAMPLES. A few examples will make clear the several aspects 
of field settings discussed above and will illustrate some of the roles 
and relationships that experimenters have established. We shall 
choose examples illustrating three main types of relationships— first, 
where a single experimenter is an outsider who belongs neither 
to the organization being studied nor to a research organization; sec- 
ond, where the researcher belongs to a large independent research 
organization; and, third, where he is a member of a research unit 
within the organization studied. 

Jackson's study of leadership (21) is an interesting example of 
the kind of cooperation a skillful experimenter can obtain even 
without the help of a large research organization. The experiment 
was performed as an M.A. thesis while the author was a graduate
student. After discovering that a military setting was not appropriate 
for the design, he was introduced by his psychology professor to a 
vice-president of a company which had sent representatives to the 
professor’s training seminar. From then on, the experimenter was 
on his own in developing rapport, obtaining approval of the project, 
and carrying it out. The experimental manipulation required 
exchanging the three best and the three worst foremen, an opera- 
tion which involves more than the usual upset of the organization. 
In addition, it was necessary to prevent any further transfers of the 
six experimental foremen and their seventy men during the three 
months of the experiment. Accordingly, the experimenter stressed
from the very beginning both the importance of these requirements 
and the fact that they might be difficult and upsetting at times. 
Then, with the help of the vice-president, a series of meetings was
held to obtain understanding and approval by the other managers 
involved. This type of clearance throughout management (and with 
the union) resulted in enough support for the project so the manipu- 
lation was carried out and conditions were held constant with 
respect to transfers, despite pressing reasons for making changes.
The main reasons for success in obtaining such cooperation seemed
to be the experimenter's skill in explaining the project, in
conducting meetings, and in leading group decisions, and the fact
that the company got as a by-product, without cost, the methods it
needed for evaluating foremen.

An example in which the relationship is between an independent
research organization and a client company is found in the work of
the Survey Research Center at the University of Michigan. This
research agency had a quite specific contract with a large insurance
company to conduct a field experiment (see p. 1…) on the effects of
the location of control and regulation (34, 35). This contract
specified the independent role of the research organization and
guaranteed its right to publish the findings. Since the research
team had no authority in the company, it could carry out the
experimental manipulation only through the help and cooperation of
management. Such a manipulation as delegating control and
regulation, of course, required not only a long series of meetings
to arrange the new duties and responsibilities of the employees but
also a long series of training sessions to prepare them for their
new responsibilities. In this process the research team used both
members of the company and outside specialists. Such a large,
complex, cooperative venture could succeed only because the company
and the research organization had already established good
relations during an earlier field survey, during which the research
organization had been able to discover in detail the initial state
of the phenomena to be studied in the experiment, namely, the actual
locus of control and regulation. In addition, however, they had
acquired considerable knowledge about the internal relations within
the organization and the acuteness of various problems.

Quite a different relationship between the experimenter and the
industrial setting was involved in some experiments at the Harwood
Manufacturing Corporation, where the author had an inside role as
director of personnel research. Even the first studies in this
organization could involve experiments, because the top management
was favorable toward research and sophisticated about it, owing to
the unusual fact that the president of the company had a doctorate
in psychology. The author's position in top management, and strong
support from the president, permitted freedom to do experiments.
There was some possibility of holding conditions constant during an
experiment. For example, during experiments on the effects of group
decision on the level of production it was possible to prevent
supervisors from exerting any influence on the production of their
employees, and where it was not possible actually to control
confounding factors it was usually possible to know about their
existence
in advance. On the other hand, this inside position meant that the 
experimenter was subjected to stronger influences of practical
considerations of company goals and policies. Although this inside
role led to better understanding of the dynamics of the factory as
well as inside knowledge of events before they occurred, it also
clearly limited access to certain information. Initially, when there 
was no union, it was rather easy to obtain a wide variety of informa- 
tion about human relations in the plant by interviewing people 
at all levels; later, shortly after the plant became unionized, it was 
impossible to establish rapport with some union members and con- 
sequently to secure valid information from them. For certain kinds 
of problems, therefore, it is important for the researcher to be an 
outsider with an independent role. 

Research Design

The problems of research design for a field experiment are in 
principle the same as they are for a laboratory experiment, but the 
difficulties in manipulating and controlling variables are often 
greater in the field setting. This gives added emphasis to some prob- 
lems of control. 

CONTROL GROUPS. The use of adequate control groups is
especially important in most field experiments because the real life 
setting is both complex and changing, with many uncontrolled fac- 
tors at work. Furthermore, we are frequently quite ignorant about 
these possibly confounding factors. An experiment on the effect of 
a supervisory training program upon employee morale, for example,
might be confounded by such factors as a pay raise, changes in com- 
pany personnel policies, etc. Such a study by Hariton showed that 
there were significant changes in the direction of the training goals 
even in the control groups (19). The trained foremen were matched
with the control foremen on size of group, type of work, and level
of morale, but the method of assigning subjects was by divisions
of the company so that two divisions were placed in the
experimental group and two in the control group. Later it was
discovered that the major determinant of the effectiveness of
training was the human relations practices of the foreman's superior
(the division manager) which, of course, was systematically
different for the experimental and control groups. This is but one
example of an organizational factor which can easily confound an
experiment since it affects a whole group of subjects at the same
time in the same way. Ideally the control group should be matched
with the experimental group on all possible confounding factors;
matching on the basis of easily measured variables, such as size of
group, age of subjects, etc., will do no good unless these variables
are systematically related to the hypothesis to be tested.
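
The divisional confound described above can be illustrated with a
small sketch. It is only an illustration under stated assumptions,
not the procedure used in the study cited: the foreman records, the
field names (`division`, `morale`), and the function
`assign_within_divisions` are all hypothetical. The idea is to pair
foremen on a measured variable inside each division and to split
each pair at random, so that a factor acting on a whole division
falls equally on both groups.

```python
import random
from collections import defaultdict

def assign_within_divisions(foremen, seed=0):
    """Pair foremen inside each division by morale, then split each pair at random.

    foremen: list of dicts with the hypothetical keys 'name', 'division', 'morale'.
    Returns a dict mapping name -> 'experimental' or 'control'.
    """
    rng = random.Random(seed)
    by_division = defaultdict(list)
    for f in foremen:
        by_division[f["division"]].append(f)

    assignment = {}
    for members in by_division.values():
        # Sort so that adjacent foremen have similar morale scores.
        members = sorted(members, key=lambda f: f["morale"])
        # Walk through adjacent pairs; an odd foreman out is left unassigned.
        for i in range(0, len(members) - 1, 2):
            pair = [members[i], members[i + 1]]
            rng.shuffle(pair)
            assignment[pair[0]["name"]] = "experimental"
            assignment[pair[1]["name"]] = "control"
    return assignment

# Invented data: four divisions (A-D) with four foremen each.
foremen = [
    {"name": f"F{d}{i}", "division": d, "morale": m}
    for d in "ABCD"
    for i, m in enumerate([3.1, 4.0, 2.5, 3.8])
]
groups = assign_within_divisions(foremen)
```

With this layout every division contributes the same number of
foremen to each group. The price is that experimental and control
foremen now work side by side, which raises the risk of
contamination between groups.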

Solomon has shown (41) that the conventional design with a
single control group is inadequate where the pre-experimental
measures interact with the experimental treatment and influence its
effectiveness. He conducted an illustrative field experiment on the
effects of teaching spelling in a grammar school and demonstrated
that the preliminary spelling test actually interfered with the
training. The preliminary test resulted in smaller gains in the
experimental group and in the conventional control group than were
shown by a third matched control group which did not take the
preliminary test but did receive the training. Solomon also mentions
interaction effects and effects on variance in studies of attitude
change. Canter has shown similar interaction effects in a study of
human relations training (5). On a priori grounds one would suppose
that certain measurement procedures, such as open-ended interviews,
might produce even larger interaction effects. Accordingly, it seems
desirable, where possible, to use an extended control group design
in field experiments. This would involve the comparison of matched
groups on post measures.
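
The logic of comparing matched groups on post measures can be
sketched numerically. The sketch below is an illustration under
assumptions: the scores are invented, the function name
`interaction_from_post_measures` is hypothetical, and a fourth group
receiving neither the preliminary test nor the training is assumed
in order to complete a 2 x 2 layout (the text above describes three
groups). The interaction is taken as the difference between the
training effect among pretested groups and the training effect
among groups that were not pretested.

```python
from statistics import mean

def interaction_from_post_measures(post):
    """Estimate effects in a 2 x 2 (pretest x training) layout from post scores.

    post: dict keyed by (pretested, trained) booleans -> list of post scores.
    """
    m = {cell: mean(scores) for cell, scores in post.items()}
    training_with_pretest = m[(True, True)] - m[(True, False)]
    training_without_pretest = m[(False, True)] - m[(False, False)]
    return {
        "training_effect_with_pretest": training_with_pretest,
        "training_effect_without_pretest": training_without_pretest,
        "interaction": training_with_pretest - training_without_pretest,
    }

# Invented post-test spelling scores, for illustration only.
post_scores = {
    (True, True): [14, 15, 16, 15],    # pretested and trained
    (True, False): [10, 11, 9, 10],    # pretested, not trained
    (False, True): [18, 17, 19, 18],   # trained, not pretested
    (False, False): [10, 9, 11, 10],   # neither (assumed fourth group)
}
result = interaction_from_post_measures(post_scores)
```

In these invented data the training effect is 5 points with the
preliminary test and 8 points without it, so the interaction is -3:
the pretest reduced the apparent gain from training, which is the
kind of interference the spelling experiment reported.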

CONTROL THROUGH MEASUREMENT. Where the experimenter is not able
to standardize conditions, the effect of confounding factors may
nevertheless be discovered through measurement. If he knows all
the additional determinants of his dependent variable, the
experimenter can measure them and discover through analysis whether
any of them were confounded with the experimental variable. Thus,
the field experiment will commonly require more measurement of the
total situation and particularly of any uncontrolled events
occurring during the course of the experiment. If the experiment
has been preceded by field studies in the same setting and on the
same problem, it will be possible to know more of the possibly
relevant factors which must be measured.
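
One simple way of discovering through analysis whether a measured
factor was confounded with the experimental variable is to compare
the two groups on that factor. The following is a minimal sketch
under assumptions: the factor names, the values, and the cut-off of
0.25 are invented for illustration, and the standardized difference
used here is one common index chosen for the sketch, not a method
named in the text.

```python
from statistics import mean, stdev

def standardized_difference(experimental, control):
    """Difference between group means of one measured factor, in pooled-SD units."""
    pooled_sd = ((stdev(experimental) ** 2 + stdev(control) ** 2) / 2) ** 0.5
    return (mean(experimental) - mean(control)) / pooled_sd

# Invented measurements of two uncontrolled factors, one value per work group.
measured = {
    "absentee_days": {"experimental": [2, 3, 4, 3], "control": [5, 6, 7, 6]},
    "group_size": {"experimental": [10, 12, 11, 11], "control": [11, 10, 12, 11]},
}

THRESHOLD = 0.25  # arbitrary cut-off chosen for this illustration

imbalance = {
    name: standardized_difference(g["experimental"], g["control"])
    for name, g in measured.items()
}
# Factors on which the groups differ markedly are candidates for confounding.
flagged = [name for name, d in imbalance.items() if abs(d) > THRESHOLD]
```

A flagged factor does not by itself show that the result was
confounded; it marks a variable that differs between the groups and
so must be taken into account when the dependent variable is
interpreted.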

REPLICATION. Because of the difficulty of standardizing conditions
and because conditions may vary over time, it is desirable, where
possible, to design field experiments that can be replicated in the
same setting. For example, Bavelas ran a series of experiments on
the effects of group decision on production in the Harwood
Manufacturing Corporation (28). These experiments took place over a
fairly long period of time and under somewhat varying conditions.
The present author subsequently repeated some of the same
experiments in the same setting at a later date (15). Such
replication in the same setting is often difficult for group
experiments because of an insufficient number of matched groups. To
replicate an experiment in an organization setting there must be a
number of parallel or matched units. Where the units or groups are
not matched (i.e., where they are performing different functions or
are differently organized), new and irrelevant variables may intrude
themselves.

PRELIMINARY EXPERIMENTS. Planning for replication of a field
experiment even in the initial design has the further advantage of
providing an opportunity to perfect the experimental manipulation
and the controls. If replication is not planned, it is all the more
important to conduct preliminary experiments to work the bugs out
of the experimental manipulations and to ensure that they will
be sufficiently strong to produce measurable differences. The
running of preliminary experiments has not, in fact, been done
frequently enough. Experience from several field experiments would
indicate the importance of providing ample opportunity to practice
the necessary manipulations and at the same time to provide
opportunity for testing measuring instruments.

STANDARDIZATION AND INSULATION. So far we have discussed
controlling conditions in a field experiment through the use of
control groups and adequate measurement. We have described
these methods first because they are the easiest rather than the
best. Actually such control does not eliminate unwanted variation
in conditions; it makes it possible merely to reduce the bad
effects. There are two ways of experimental control which do
eliminate or reduce such variation: standardization and what we
shall call insulation.

Sometimes it is possible for the experimenter to standardize
conditions in the sense of holding constant something which might
otherwise vary in the field setting. In several studies of the
effects of social variables (such as group decision) on production
in a factory (30, Chap. 9), it was possible to hold constant most of
the variables affecting production: the type of machinery, the flow
of materials, and the composition of the group (no transferring of
employees into or out of the group was permitted). On the other
hand, absenteeism and slight variations in the quality of materials
could not be prevented. The field experimenter should attempt
maximum standardization of such conditions.

Insulation is less clear as a form of control, but we shall use
this term to refer to the elimination of certain conditions normally
existing in the field setting. In the Hawthorne studies, for
example, the Relay Assembly Test Room was insulated from the rest of
the plant both physically and socially (37). There were walls
separating the experimental group from the other employees, and the
normal supervisory relation was eliminated. Evidently the conditions
created were so different from the usual factory conditions that
this experiment stands midway between a field experiment and a
laboratory experiment. A much milder degree of insulation was used
in the experiments on group decision, on pacing cards, and on
production by instructing the supervisors not to say anything to
their employees about level of production, speed of working,
steadiness of working, etc. (30, p. 215). Thus insulation, by
eliminating some variables, simplifies complex conditions so that
the experiment becomes more like a laboratory experiment (see
Chap. 4).

In itself, such simplification is neither good nor bad; rather, it
must be evaluated in terms of the same kind of considerations
relevant for choosing how analytical to be (see p. 105) or whether
to use a laboratory experiment or a field experiment.

A special problem of insulation is the possibility that one
experimental group may contaminate another experimental group or
the control group. Communication among groups can lead to serious
confounding. In an experiment on the effects of participation on
production, for example, competition developed between two groups,
thus confounding the effects of participation (9).

SAFETY FACTORS. Because of the danger that the experimental
manipulation may not be strong enough, in the sense that it does
not produce sufficiently large differences, it is desirable to
design the experiment with certain “safety factors” so that some of
the values of a field study may be obtained even if the experimental
manipulation fails. For example, an experimental study of the effects
of human-relations training obtained measures of changes in both 
group structure and group process (4). Even though these variables 
showed little change that could be attributed to the training, there 
were nevertheless some interesting findings on the interrelations 
of process variables and structure variables. 

SIZE OF UNITS AND LENGTH OF TIME. In general, it is probably 
much more effective to apply experimental treatments to units of 
fairly small size: individuals or small groups rather than large
institutions. Dealing with smaller subgroups permits easier manipulation
and provides an opportunity for preliminary experimentation and 
replication. The size of the experimental manipulation should also 
be reduced where possible by confining it to a relatively short period 
of time. The longer the period of time required, the greater the
probability of unforeseen outside changes taking place. One has 
only to examine the changes over time in existing data on such
variables as production or absenteeism in a factory, political and 
social attitudes, etc., to see that fairly large changes can take place 
in a few months. Thus the longer experiments are more likely to 
be confounded. Finally, from the point of view of efficiency alone, 
it is desirable to reduce the time to a minimum. 


CARRYING OUT A FIELD EXPERIMENT 

It is not easy to describe the methods used and the skills 
required for carrying out the experimental manipulation and
controls and for building and maintaining the necessary cooperative
relationships, because these skills are not often described in detail 
in reports of research. To generalize about these skills is even more 
difficult, for there has been little scientific study of them, and they 
are not commonly taught in courses on research methods (6). Actu- 
ally, the practical "social managers” and the professional consultants 
seem to be more skillful than the social scientists in bringing about 
a planned change in a social situation such as would be required to 
carry out an experimental design. 





Accordingly, this section will draw heavily on the ideas of 
Kenneth Benne and the staff of the National Training Laboratory 
in Group Development concerning the skills of the practitioner 
(36). In discussing social action, they point out, 

The behaviors of persons, groups, organizations and com- 
munities may be thought of as held in their accustomed grooves 
by an equilibrium of forces tending to move the level of
functioning in one direction or another. These forces may be of
various kinds and magnitudes—established status relations, laws,
personal and group sanctions and standards, established competence
in a given way of working, feelings of inadequacy, non-perception 
of alternative ways of behaving, etc. Change, in the sense of an 
alteration in the level and way of functioning of a person or 
group, or an organization or community, occurs when this equi- 
librium of forces is disturbed. According to this view, change may 
be seen as the transition of a system of behavior from one 
equilibrium level to another. For example, a teacher may change 
from a manipulatory to a collaborative control relation to her 
group of students or vice versa; members of a group may change
their participation pattern from one with a high percentage of 
individual centered roles to one with a low percentage of such 
roles, or vice versa; a community may change from a condition 
of competition between welfare agencies to a condition of 
coordinated cooperation, or vice versa; etc. . . . Planned change 
occurs when the forces holding the person or group or community 
at a given level, with respect to one or another phase of its life, 
can be assessed, when factors making for potential disequilibrium
are understood, when a new possible and desirable state of affairs
can be projected, and when the forces for effecting movement
to this new equilibrium can be developed or manipulated to this
end (36, p. 108).

This is also a good description of the complex process involved in 
the planned manipulation of a variable in a field experiment. 

Our conception of how to bring about these changes is thus
actually based on: (1) some theorizing about the process of change
in terms of quasi-stationary equilibria (30, Chap. 9), (2) some
conceptions of scientific methodology as an effective method of
problem solving, (3) a methodological conception of democratic
ethics, and (4) some scientific research on the effectiveness of
certain democratic procedures in bringing about planned changes. We
shall not attempt
to present the rationale for these foundations here but shall proceed 
directly to their implications for action. First we shall describe two 
experimental procedures and then attempt a few generalizations 
regarding the methods and the skills involved. 


Examples of Experimental Treatments 

A field experiment by Coch and French illustrates a method 
of manipulating the variable of participation and at the same time 
shows the effects of this variable in bringing about planned changes 
(9). Since both authors had conducted earlier experiments in the 
same factory, there was no problem of establishing the role of the 
researchers in the factory. Furthermore, the experiment helped 
to solve one of the most difficult postwar problems of the factory.
Because most of the products had to be changed as the plant
reconverted to civilian production, there were widespread tech- 
nological changes affecting many employees. Earlier studies in the 
same factory had shown that such technological changes produced 
large decreases in production as well as strong resistance to the 
changes on the part of employees. 

The experimenters wanted to test the hypothesis that the 
greater the participation on the part of employees in planning to
meet the changed conditions, the less would be their resistance to
the change and their loss in production. This hypothesis was
discussed with top management during the attempts to foresee the
problems that would be caused by the changes and to diagnose
their nature. Fortunately the management, including the
experimenters, had experienced a considerable body of past research
both on the effects of job changes on morale and productivity and on
the effects of group decision in changing productivity. On the basis 
of the previous findings and of the research-mindedness developed 
through previous participation in research, it was easy to plan 
jointly not only a procedure for handling the technological changes 
by a method of participation but also an experimental design for 
systematically varying the degree of participation. 

Three different degrees of participation were tried on matched
groups. The “no participation,” or control, group was changed over
to the new job according to the usual factory procedure, which
consisted of explaining, in a group meeting, why the changes were
necessary, what the new job would consist of, and what the new
piece rate would be. As usual, this procedure was carried out by
the production manager and the time study engineer. The second
treatment, “participation through representation,” consisted of a
group meeting, also conducted by the production manager, in which
the need for a change was presented as dramatically as possible and
the broader problem of cost reduction was shared with the group.
This led to a discussion of the types of changes that should be made
in the product and the suggestion that management should make
work simplification studies, train several operators in the new
methods, and set the piecework rates by time studies on these
specially trained operators. The new job and the new piece rate
would then be explained to all operators, with the special operators
helping to train the others. These special representatives were then
elected by the group. Later they met with the management to plan
more about the new job, and they presented many good suggestions.
These special operators became so involved that they referred to the
new job as “our job” and “our rate.” The third experimental
treatment, “total participation,” was much like the second except
that all operators participated directly in planning the new job
and the new rates rather than participating through representatives.
Although only a few hours were involved in the meetings which
constituted the experimental manipulations, it was clear that they
had strong and very different effects. Probably this was due partly
to skillful leadership by the production manager and partly to the
fact that he had an important role in the factory, which gave added
meaning to his behavior. Although no measurements of the
experimental treatments were made, the measured effects on
production showed much higher productivity with greater
participation.

This principle of participation is of fundamental importance
in carrying out an experimental procedure. The production manager
and other members of management were motivated to cooperate in the
experiment because of outside economic pressures combined with
recognized difficulties in handling such problems in the past.
Certainly their past experiences with similar research on group
decision contributed to both their understanding and their support.

Quite a different problem of experimental manipulation faced
the research team in a study on changing attitudes in a housing
project (14). A previous survey showed that members of the project
had hostile attitudes toward one another, were ashamed of living
in the project, and had little contact with either their neighbors
or the surrounding townspeople. The experimenter wished to test
the theory that stimulating contacts among project residents under
favorable conditions would reduce the hostile attitudes, which,
in turn, would reduce the shame at living in the project, which,
in turn, would cause the residents to initiate more social contacts
with the town. The people in the project, however, felt no need
for increased contact. In fact, on the basis of past failures, they
were pessimistic about community activities. Furthermore, the expert
community organizer on the research team had no clearly established
role in the project. It was clear that she did not belong to the
project, but it was not clear that she was related to the university
research team.

The community organizer began with a meeting of residents in
which she stimulated interest in a nursery school project, a
recreation program for school-age children, and adult education and
recreation activities. Planning committees were formed, and a
community-wide meeting was held for further discussion of plans with
resource experts. As progress was made in organizing these activities
during the first month, and as more women became active, resistance
to the activities developed, particularly on the part of the old
established leaders. They apparently perceived the new activities as
giving support to a possibly competing set of leaders. Eventually, a
hostile rumor branding the activities as Communistic forced the
suspension of all community activities for two weeks until the rumor
could be combated by the presentation of information about the
sponsorship and purposes of the research program, by deliberate
efforts to integrate the old leaders into the new activities, and by
statements by the local project manager to the effect that the rumor
had been demonstrated to be unfounded. Subsequently these and
other community activities were started up again and continued over
a total period of eight months. (The detailed record of this
experimental procedure can be found in 14, pp. 28-44.) This
experiment illustrates two errors which were later corrected but
which probably could have been avoided in the first place if
foresight had been as good as hindsight. First, the role of the
community organizer
should have been structured more clearly for the residents; secondly, 
the hostility of the community leaders should have been avoided by 
not appearing to support the competing leaders. 

Cooperative Relations 

We have already indicated (see p. 108) that cooperative rela- 
tions must start in the planning phases of the project in terms of 
selecting a setting, a problem, and a sponsoring organization in 
which the possibilities for mutual benefits from the research are 
maximal. Lippitt (31) summarizes the reasons for successful
cooperation on a training workshop in intercultural relations as
follows:

It seems probable in retrospect that this collaboration of 
social practitioners and social scientists was successfully initiated 
and carried through to fruition because: 

The social practitioners—

—felt keenly the inadequacy of present accomplishments in 
community improvement of intergroup relations. 

—were deeply sincere in their desires to achieve results rather 
than just to receive recognition for noticeable efforts. 

—believed in the potentialities for constructive change in the 
intergroup relations of conflicting groups in the community. 

—had a hunch that the methods for bringing about these 
changes might be more efficiently discovered by the application 
of scientific methodology than by the usual trial-and-error efforts
of experience, unguided by systematic fact finding.

—were ready to have faith that this particular group of social
scientists was sensitive to the requirements of an action job (i.e.,
running a workshop) as well as being adequate research technicians.

—perceived that these social scientists had strong personal 
motivations to do something about intergroup relations in addition
to a need for more scientific understanding.

The social scientists—

—had as a part of their research team personnel skilled in
group action techniques as well as measurement methods.

—had arrived at the conclusion that experimental methodology
could be successfully applied to the study of the life of
small and large groups in the democratic community.

—had strong feelings that intercultural and interracial conflict,
besides being a challenging scientific problem, was a
priority social issue.

This description fits Deutsch's theory of cooperation (11).
Establishing a cooperative situation means creating a condition
such that if one party achieves his purpose or goal, the other party
at the same time achieves or is brought closer to his goal.
Conversely, it is important to avoid a competitive situation, in
which locomotion of one party toward his goal will move the other
party further from his goal. Deutsch's research shows that, given a
basic cooperative rather than a competitive situation, one can
expect mutual helpfulness, better understanding of one another's
communication, improved coordination of efforts, and other forms of
behavior commonly considered cooperative (12).

The longer and more complex the experimental treatment, the
more important it is to think in terms of maintaining the
cooperative relationship by a constant awareness of one's own goals
and motivations and how they are related to those of others involved
in the experiment. Probably most of the early resistances
encountered in trying to carry out the experimental procedures in
the housing study stemmed from the fact that psychologically there
was no cooperative situation for the members of the housing project.
As the perception of a cooperative relation grew in more and more
residents, these resistances seemed to decrease. Thus we can see
that establishing a cooperative relationship may be a gradual
process which has to extend to more and more of the client
organization during the course of the field experiment. Two
prescriptions are implied for the experimenter: (1) try to clarify
for all people concerned just what you are going to be doing, when,
and with whom; (2) provide enough time, enough contact, and enough
communication so that these people can “have confidence in” the
experimenter, in the sense that they believe that he will do nothing
to harm them or to conflict with their interests and, indeed, that
he has some positive concern for their welfare.

The social scientist may be able to rely on the skill of an
experienced practitioner, and cooperation with him will sometimes
lead to the solution of the difficult problem of carrying out the
experimental manipulation. This solution was successful in the
experiment on participation because it was possible to obtain the
intelligent cooperation of the production manager, who had the
necessary
skills to conduct the meetings successfully and a position of author- 
ity in the company. Many of the problems which have been attacked 
through field experiments have, in fact, been problems related to 
rather highly developed professional skills in such areas as
human-relations training, therapy, community organization, etc. For such
problems there are already available highly skilled professional 
people who may be called upon to conduct the manipulation. The 
housing study is a case in point where a professional community 
organizer was employed as part of the research team. Ezriel points 
out that even so complex a skill of the psychoanalyst as the making 
of an interpretation in terms of unconscious impulses and
fantasy-determined reasons can be used to conduct replicable field
experiments within the psychoanalytical session (13).


Executing the Experimental Treatment 

Once the basic role relationships are worked out, there should
be a collaborative diagnosis of the situation by the researcher and
at least some part of the client organization. The purpose of this
diagnosis is to assess the various factors that will be involved in
executing the research design: the resistances that may be
encountered, the dynamics of the situation in regard to the problem of
bringing about a change, etc. The ways of going about such a col- 
laborative diagnosis will vary tremendously, depending upon the 
problem to be studied, the setting, etc. In the example on partjcjpa- 
it was possible for all those involved in the diagnosis to rely 
both an extensive practical experience in the situation and a 
<^onsiderable amount of relevant research data. Of course, all such 
relevant experience and information should be used; but whatever 
'he source, a useful diagnosis will have to be formulated in theoreti- 
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cal terms. Because it is necessary to predict reactions and to bring 
about planned changes, the diagnosis must go beyond the 
statement of facts or the labeling of things as “good” or “bad \ it 
must move toward more causal thinking, for it is only by manipu* 
lating its causes that we can vary our dependent •variable. Getting 
help from the practitioner is both a way for the scientist to increase 
his own diagnostic sensitivity and a method for maintaining the 
“involvement” and interest of the practitioner (22). 

The next step should be joint planning, based on an adequate 
diagnosis, of the actions that must be taken in order to manipulate 
the independent variable and to control other possibly confounding 
factors. To be most effective, this collaborative planning must go 
beyond the point of simply explaining to the client organization 
what is required in the research. It should also include joint
decision-making concerning the methods and techniques by which the
design is to be carried out. It is desirable that the practitioners be 
enough involved in the research design so that they will fully 
understand not only what it is but some of the reasons lying be- 
hind it. 

In effect, this usually means that the experimenter has to train 
some parts of the client organization in research methods. Many 
practitioners, for example, will have to learn more about scientific 
method before they are convinced of the necessity or the desirability 
of having a control group in a field experiment. A common reaction 
of management to a new selection device which looks promising 
is to want to employ it in selecting all new employees, whereas it will 
be necessary to continue selecting some employees on the old basis 
in order to evaluate the effectiveness of the new procedures. Here
it is quite important to take time for the practitioner to learn why
the control groups are necessary and how an increase in scientific 
understanding can contribute to his own practical objectives. Sev- 
eral experiments, including our example on the effects of participa- 
tion, indicate that the more persons in the client organization who 
can be involved in the planning, the more widespread will be the 
support in carrying out the experiment.

In field experiments in social psychology, one special problem 
of collaborative planning with the client organization frequently
faces the experimenter— namely, the problem of secrecy. In order 
to carry out certain experiments, it may be necessary, at least for a
certain period of time, that all the subjects in the experiment, or
possibly all of the people in the client organization, be prevented 
from learning fully the purpose of the experiment or some aspects of 
the procedure. Such secrecy is necessary if knowledge of the hy- 
potheses would confound the results of the experiment. Of course, 
it is possible and desirable to give a full explanation at the conclu- 
sion of the experiment, but even where this is done the temporary 
secrecy can be a serious disturbance to good relations. In the housing 
study, for example, the experimenter was testing the hypothesis that 
improved interpersonal relationships within the housing project 
stimulated by the community activities would result in spontaneous 
initiation of communication between the housing project and the 
surrounding town (14). Accordingly, it was essential that the re- 
search team, while stimulating community activities within the 
project, should not also stimulate activities between the project and 
the townspeople. But it was also essential for adequate control that 
the subjects in the project not know about this reason for the experi- 
ment for fear they might initiate contact with the townspeople 
simply to please the experimenter. If this need for temporary secrecy
is faced frankly in the early phases of planning for the experiment, 
and if it becomes a part of the training of the client in research 
methods, then it need not be a factor disturbing the cooperative
relationship.

In planning techniques for the manipulation of variables,
successful strategy and tactics for changing the variable must be based
on a correct and adequate theory of change. For example, in the
experiment on the relations between the housing project and the
surrounding town, the manipulation of interpersonal relations within
the project was based on the diagnosis that hostile interpersonal
relations existed because of autistic hostility and on the specific
theory that hostile relations could be changed by stimulating
communication among members through any successful group [...]. If
these theories and the assumptions upon which they were based
were incorrect, then the actual [...] in the project would not in
[...] interpersonal relations. In the long run, therefore,
improvements in our theory of change will increase [...] successful
manipulations.

Lewin (30, p. 195) has pointed out the importance of dealing
with the total situation when manipulating a variable:

To vary a social phenomenon experimentally the experimenter
has to take hold of all essential factors even if he is not
yet able to analyze them satisfactorily. A major omission or
misjudgment on this point makes the experiment fail. In social
research the experimenter has to take into consideration such
factors as the personality of individual members, the group
structure, ideology and cultural values, and economic factors. Group
experimentation is a form of social management. To be successful
it, like social management, has to take into account all of the
various factors that happen to be important for the case in hand.
Experimentation with groups will therefore lead to a natural
integration of the social sciences, and it will force the social
scientist to recognize as reality the totality of factors which
determine group life.


For experiments which are more oriented to the testing of
theoretical hypotheses, it is usually more difficult to use skilled
practitioners. The manipulation will more frequently be one that has
not been done before, or there will not be any large body of practical
experience; thus, it will more often be necessary for the
experimenter to conduct the manipulation himself. In these field
experiments, where the problem under investigation is actually some
theoretical hypothesis, it is important that the experimental
manipulation be valid in the same sense that a measurement should be
valid; i.e., it should conform to the conceptual definitions involved
in the hypotheses.

This is an additional difficulty in carrying out the experimental
[...] for the experimenter. When, for example, one experiments on
such a question as the effects of [...] compared to the effects of
[...], the experimenter has great freedom in how he [...] these two
manipulations, since they have such broad and [...]. Rosenberg's (38)
experiment on role playing studied the more specific hypothesis that
“deeper” emotional and attitudinal changes are produced the more the
subject is deeply involved in the process of playing the role. In this
case, it is important that the experimental manipulation actually
conform to the conceptual definition of degree of involvement in the
role rather than to some other similar independent variable. The
methods used for creating three degrees of involvement were: (1) the
subjects actually participated in playing the key role in the
sociodrama; (2) the subjects were instructed to identify with somebody
else whom they observed playing the key role; (3) the subjects were
instructed simply to observe what was going on in the group where
the sociodrama was presented. Obviously, this procedure does vary
the degree of involvement, but this variable is not defined clearly
enough to permit accurate judgments as to how well the
manipulations fit the definition. Probably other factors were varied
simultaneously so that the experiment was to some extent confounded.
For example, in one treatment, the attention of the subject was
directed toward the key figure being observed, whereas in another
treatment the attention of the subject was directed toward the group
as a whole. One might say that direction of attention was
systematically confounding these two degrees of involvement, thus reducing
the validity of the manipulation.


Problems of Measurement 

Especially where the field experiment is designed to study some
theoretical hypothesis, it is essential for the researcher to measure
the success of the manipulation used, for it is only by relating the
results to some known procedure that knowledge can be advanced.
Where one is dealing with complex experimental procedures, the
measurements themselves may be quite complex, as in the case of
the housing study. Even an extremely condensed reporting of the
experimental manipulations used in this study required fifteen
pages. In other cases, the manipulation has been measured
quantitatively, as in the observational measurement of [...] of the
training process in a workshop (31, Chap. [...]). In more analytical
experiments, the measurement of the manipulation will be simpler. In
Rosenberg's (38) experiment, the [...] measurement was to ask the
subjects in a [...] whether they had identified with the key figure or
[...] the total group as instructed. The answers to [...] showed that
on the whole the instructions [...]; there were some exceptions. When
these latter subjects were [...] in the analysis of the data, it was
found that the hypotheses were even more clearly supported than when
the subjects were analyzed in accordance with the treatment to which
they were assigned. Thus, measuring the success of an experimental
manipulation not only gives some indication of the degree to which it
was successful but can also permit more valid manipulation through
the elimination or reclassification of unsuccessful cases in
subsequent analysis.
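
The comparison described above, between analyzing subjects by the
treatment assigned and by the treatment that actually took effect, can
be illustrated with a minimal sketch. Everything in it is an assumption
made for illustration: the records, the condition labels "identify" and
"observe", the scores, and the helper names are invented and are not
data or procedures from Rosenberg's experiment.

```python
from statistics import mean

# Hypothetical records: (assigned condition, condition the subject reports
# having actually experienced, attitude-change score). All values invented.
subjects = [
    ("identify", "identify", 6),
    ("identify", "identify", 7),
    ("identify", "observe", 2),   # instruction did not take effect
    ("observe", "observe", 2),
    ("observe", "observe", 3),
    ("observe", "identify", 6),   # instruction did not take effect
]

def group_means(records, key_index):
    """Mean score per condition, grouped by assignment (0) or by report (1)."""
    groups = {}
    for record in records:
        groups.setdefault(record[key_index], []).append(record[2])
    return {condition: mean(scores) for condition, scores in groups.items()}

def difference(means):
    """Difference between the two condition means."""
    return means["identify"] - means["observe"]

# Share of subjects for whom the manipulation worked as intended.
success_rate = mean(1 if a == r else 0 for a, r, _ in subjects)

as_assigned = group_means(subjects, 0)   # analysis by treatment assigned
as_reported = group_means(subjects, 1)   # analysis after reclassification
```

In this invented example the difference between conditions is larger
when subjects are reclassified by their own reports than when they are
analyzed as assigned, which is the pattern the text describes.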

In a field experiment, most of the problems of social psychological
measurement are the same as they are for other types of research.
(The reader is referred to Chapters 6, 11, and 12 of this book, which
discuss measurement.) However, a word of caution should be added
concerning one point. In most organizations certain sorts of social
psychological data already exist in the form of records used for
administrative purposes. In industry, for example, one frequently
finds records of productivity, of absenteeism, of turnover, and of
other aspects of human behavior. It is tempting for the social
psychologist to use all these data, which are available with no
expenditure of time, money, or effort on his part. Yet it seems
to be widespread experience that such administrative records are
a snare and a delusion for research purposes. First, since they are
not designed for research purposes, they do not always measure the
exact variables demanded by one's hypotheses. Thus, they tend
to influence research away from a theoretical orientation; in the
long run, this is probably a serious hindrance to the development
of an adequate basic social science. Secondly, such records are usually
very disappointing with respect to accuracy. For many purposes of
the administrator, records need not be exact, and most of the
personnel who collect or use such records are not trained in scientific
standards of accuracy. Furthermore, the inaccuracies are frequently
motivated and therefore concealed. One experimental procedure,
for example, resulted in a marked decrease in the variability of
production, which the researchers first interpreted as an indication
of reduced tension (30, pp. 217-218). In a subsequent interview,
however, we discovered that the employee had been turning in for the
records only a part of her production on those days when her actual
production was so high that she feared a cut in her piece rate. This
word of caution is not to say that the researcher should always
disregard records collected for administrative purposes; rather, he
should avoid being overoptimistic about their use and should use
them only when he has an adequate check on their accuracy.


Ethical Problems 

Field experiments involve new problems of professional ethics,
some of them more difficult than those in laboratory
experiments or field studies in which no changes are introduced
by the experimenter. In a life setting, the experimental procedure
is, in fact, social action, sometimes in a situation of social conflict,
in which there are differences in values among the people involved
(17). In such a situation it is especially important for the
experimenter to be guided by a code of professional ethics. No generally
accepted professional ethics, particularly for this branch of research,
has yet been developed, although the American Psychological
Association has been working on the formulation of such codes.
Accordingly, this chapter will not attempt the difficult task; for the moment,
each researcher must work out his own ethics.

It should be pointed out, however, that the experimental
methods which have been recommended here are not ethically
neutral. Maximizing the mutual benefit of the research to both the
practitioner and the researcher, collaborative diagnosis,
participation in planning the research, open dealings with client
organizations, and educating the participant to understand the research:
all are specifications of a democratic ethics. These methods of dealing
with people are based on an explicit recognition of the ultimate
value of each person and his right of self-determination.


A SUMMARY OF THE ROLE 
OF FIELD EXPERIMENTS


The practical advantages of a field experiment are [...]
and simple. Anyone who wishes to take effective action in
any setting can improve upon the uncontrolled [...] methods by the
application of more scientific experimental procedures. Through
careful measurement, better [...] or control groups, and other
aspects of improved experimental design, the practical problems of
social action can be solved with greater certainty, with greater
accuracy, and sometimes with greater efficiency than through
common-sense trial-and-error methods.

The primary scientific advantage of the field experiment is that
it permits a more unequivocal determination of causal relations;
it permits determining cause and effect where a field study would
reveal only a correlation. Secondly, we have seen that the field
experiment is particularly appropriate for studying methods of social
change, social processes, and social influences. Thirdly, the field
experiment, since it deals with the total life situation, is well
adapted for studying complex syndromes and social processes where
the interrelationships among several more analytical variables are
involved.

But the very fact that it deals with the total life situation also
leads to one of the major limitations of the field experiment:
namely, it is not an appropriate method for studying with analytical
precision more specific single hypotheses. Finally, we must mention
as a disadvantage of the method the difficulty in carrying it out
because of the social skills required and the contacts necessary with
settings which provide a good research opportunity. Even where
the skills and opportunities are maximal, many field experiments
necessarily involve so long a span of time that they may be
inefficient.

The optimal scientific role for field experiments is in a program
of research in combination with other methods. They will be more
successful if preceded by field studies which give a more extensive
and exact knowledge of the setting and thus enable the experimenter
to manipulate and control his variables more successfully. The
development of basic theory will widen the horizons for field
experiments which test the range of application of generalizations arrived
at in the laboratory. More generally, to the degree that the field
experiment attempts to test general hypotheses, it will make a
contribution to science; otherwise it will have a more limited
practical value.


Laboratory Experiments


Leon Festinger 


Empirical science in general has as its major objective the
understanding or control of phenomena as they occur in the real
world. Nevertheless, laboratory experimentation generally plays a
significant part in the development of a science. It is important to
have some understanding of why this should be true and of the exact
function which laboratory experimentation should have in relation
to the science as a whole.

We shall, consequently, attempt to clarify two aspects of
laboratory experimentation: namely, what a laboratory experiment is and
how the results of such experiments can be applied to the “real
world.” It would be relatively easy to discuss the role of laboratory
experimentation by means of examples from the physical sciences,
but we shall attempt, rather, to illustrate the points to be made by
examples from the problem area of social psychology. Although by
doing this we may not be able to make our point as clearly as would
otherwise be the case, we hope that the discussion will be more
meaningful and carry more weight if it is entirely oriented toward
the field which is now under consideration.


THE NATURE OF LABORATORY
EXPERIMENTATION


What Constitutes a Laboratory Experiment
in Social Psychology? 


A laboratory experiment may be defined as one in which the
investigator creates a situation with the exact conditions he wants
to have and in which he controls some, and manipulates other,
variables. He is then able to observe and measure the effect of the
manipulation of the independent variables on the dependent
variables in a situation in which the operation of other relevant factors
is held to a minimum. Such a definition is, however, a great
oversimplification. Given the techniques of experimentation today
available, [...] rough approximation of this definition. As better
techniques are developed, [...] laboratory experiments will [...].
At present, however, we must include under the term “laboratory
experiment” a wide range of studies with [...] and precision.

We shall attempt, by means of examples, to distinguish between what
should properly be called “field experiments” and “laboratory
experiments.” In many cases, of course, the distinction is clear
[...]; in other cases it is difficult to maintain. In general, we
shall be guided by the two parts of our definition: whether there was
an attempt to create a specially suited situation, and [...] in the
control and manipulation of variables.

It would seem clear that experiments in industry such as have
been described in the preceding chapter should not be called
laboratory experiments. There is little or no attempt to set up special
conditions. Typically, the situation is accepted as it is found and
some manipulation is imposed. The manipulation of the
independent variable is usually a simultaneous manipulation of a set of
factors. The degree of control obtained in these experiments is
usually not sufficient to guarantee that the effects observed are
unequivocally related to the manipulation of the independent
variable.

Let us compare such field experiments with the Lewin, Lippitt,
and White study (21) on autocratic and democratic atmospheres.
This was a relatively early experiment in social psychology and
is perhaps close to the boundary between laboratory and field
experiments. In this study a number of boys' clubs were set up
for the express purpose of performing the experiment. There was
no real-life situation which was taken as given. Rather, a special
set of circumstances was created because it was felt that the
situation thus achieved would be an appropriate one for the study of
the variables in which the experimenters were interested. In this
sense it should properly be called a laboratory experiment, although
its precision is perhaps not very much greater than the precision
of an experiment in industry, such as the one reported by Coch
and French (7).

In the Lewin, Lippitt, and White experiment, the manipulation
of the independent variables consisted in having one leader of a
boys' club behave in a certain prescribed manner as compared to
another leader of another club who behaved quite differently. These
two sets of behavior, which produced measurable differences in
the behavior of the club members, were complex and differed in
many dimensions. The experimenters were undoubtedly not clear
about all aspects of the differences created. Thus, rather than
isolating and precisely manipulating a single variable or small set of
variables, the experimenters attempted a large and complex
manipulation. There was also little attempt at control in setting up the
clubs. In terms of the control achieved and the degree of refinement
in manipulation of the independent variables, this study is probably
indistinguishable from most field experiments.

We shall now consider, as an example of a laboratory
experiment with a relatively high degree of control and precision, an
experiment by Festinger (10) on voting behavior. In this experiment
an attempt was made to vary a single factor, namely, whether or
not the subjects knew the religious affiliation of the other members
of the group. Groups were set up for the express purpose of the
experiment, with care taken to ensure that every member of the
group was initially a stranger to every other member. Exactly
comparable conditions were created for each group. The nominees
for whom subjects voted were always paid participants whose
behavior was standardized. These same paid participants identified
themselves as having different religions in the different experimental
groups, thus controlling for a wide variety of personality factors and
first impressions.

In such an experiment, we can be more certain than we can in 
a field experiment that the results obtained are due directly to the 
variable manipulated by the experimenter. It is probable that a 
variable such as “whether or not the subjects know the religious 
affiliation of the other members” is still not a fine or precise factor;
it is probably, once more, a cluster of factors. A laboratory experi- 
ment should, however, attempt to refine the manipulations as much 
as the present state of knowledge permits. One of the marks of 
progress in a science is the extent to which such laboratory manip- 
ulation can be refined and specified. 

There is frequently a tendency in social psychology to criticize
laboratory experiments because of their “artificiality.” A word must
be said about this criticism, because it probably stems from an
inaccurate understanding of the purposes of a laboratory experiment.
A laboratory experiment need not, and should not, be an attempt
to duplicate a real-life situation. If one wanted to study something
in a real-life situation, it would be rather foolish to go to the trouble
of setting up a laboratory experiment duplicating the real-life
condition. Why not simply go directly to the real-life situation and
study it? The laboratory experiment should be an attempt to create
a situation in which the operation of variables will be clearly seen
under special identified and defined conditions. It matters not
whether such a situation would ever be encountered in real life.
In most laboratory experiments such a situation would certainly
never be encountered in real life. In the laboratory, however, we
can find out exactly how a certain variable affects behavior or
attitudes under special, or “pure,” conditions.

This is certainly not the end of the task. One must also find out 
how these variables interact with other variables. The possibility
of application to a real-life situation arises when one knows enough
about these relationships to be able to make predictions concerning
a real-life situation after measurement and diagnosis of the state
of affairs there.


The Relationship Between Laboratory Experimentation
and the Study of Real-Life Situations

In the conducting of research, there should be an active
interrelation between laboratory experimentation and the study of real-life
situations. It is relatively rare in social psychology that hypotheses,
hunches, and recognition of important variables emerge initially
from the laboratory; most often they arise in either the formal or
the informal study of real-life situations. In studying real-life
situations, we are forced to deal with the factors and variables as they
exist in all their complexity. Because of this complexity and lack
of control, it is rather rare that definitive conclusions and
unequivocal interpretations are reached in such studies, but frequently
new variables and new hypotheses are brought to our attention.
One can take these suggestions, hypotheses, and hunches and use
laboratory experimentation to verify, elaborate, and make more
secure the theoretical basis for the empirical results which have been
obtained.

In the laboratory experiment, sufficient control can be achieved
to obtain definitive answers, and systematic variation of different
factors is possible. As a result of this greater control, precision, and
manipulability, conclusive answers can be obtained and relatively
precise and subtle theoretical points can be tested. For example,
in a study of the spread of a rumor in a community (11) it was
found that the more friends people had, the more likely they were
to have heard the rumor. This finding may suggest the hypothesis
that friendship reduces restraints against communication of various
types of content, or it may suggest the hypothesis that the existence
of a friendship makes for an active pressure to communicate, or it
may suggest the hypothesis that those who have more friends see
more people and spend more time with these people and
consequently are more likely to have an opportunity to hear the rumor.
In a laboratory experiment it would be possible to set up a situation
in which one could, with a high degree of rigor, collect data which
would enable one to choose among these possible interpretations.
One could, for example, form groups of strangers and friends mixed
together in which the amount of contact among members and the
opportunity for communication among them were experimentally
held constant. The results would enable one to say whether the
effect of friendship existed in the absence of differential amounts
of contact. It would enable one to accept or reject the third
hypothesis stated above. In other groups one could experimentally vary
the accessibility of other members for communication to obtain
evidence as to whether the friendship represented a decrement in
restraint against communication or whether there were actual
pressures to communicate in the specific direction of friends.

Such an experiment would undoubtedly be difficult to set up,
but, since the major body of this chapter will be devoted to the
discussion of how to perform such experiments and how to produce
the desired conditions, we shall not, at the moment, go into the
details of how it might be done. Let it suffice now to say that in
the laboratory, by setting up an artificial situation, we should be
able to verify, elaborate, and refine our knowledge so as to increase
our understanding of important processes in social life. It should
be stressed again, however, that the problem of application of the
results of such laboratory experiments to the real-life situation is
not solved by a simple extension of the result. Such application
requires additional experimentation and study. It is undoubtedly
important that the results of laboratory experiments be tested out
in real-life situations. Unless this is done, the danger of “running
dry” or hitting a dead end is always present. A continuous
interplay between laboratory experiments and studies of real-life
situations should provide proper perspective, for the results obtained
should continually supply new hypotheses for building the
theoretical structure and should represent progress in the solution of
the problems of application and generalization.


Difficulties of Performing Laboratory Experiments

Laboratory experiments, however, do not represent an easy
road to the collection of data for the resolution of theoretical
problems. In social psychology they are typically difficult to do,
and many dangers are present in their execution. It is extremely
difficult to create in the laboratory forces strong enough for results
to be measurable. In the most excellently done laboratory
experiment, the strength to which different variables can be produced
is extremely weak compared to the strength with which these
variables exist and operate in real-life situations. One is able to obtain
results and to see clearly how these variables operate, in spite of
this weakness, because of the increased control one has in the
laboratory situation. But it is always possible, even probable, that
the factors will be so weak that no differences between conditions
experimentally created are apparent in spite of the increased
control. Thus, in the setting up of a laboratory experiment, especial
care must be taken to make the variables as strong as one possibly
can. Unfortunately, one can determine whether or not one has
succeeded only after the experiment is over. An exception to this
generalization about the weakness of laboratory manipulation can
be seen in Asch's use of the announced perceptions of group
members (2). This involved, however, the use of seven confederates for a
single experimental subject.

Related to the problem of the strength of forces in the
laboratory situation is the difficulty of manipulating several variables
simultaneously. In the complex field of research with which we are
here concerned, it is frequently theoretically important to see the
effect of the simultaneous operation of two or more variables.
Unfortunately, however, the more variables the experimenter
attempts to manipulate, the lower will be the strength of each
variable. This is especially true if the manipulation of the variable
is to be done by means of verbal instructions to the subjects. The
result of this is, at least at the present stage of technical
development, that the number of variables which it is possible to
manipulate simultaneously in the laboratory is relatively restricted. This
will undoubtedly become less true as more powerful techniques of
manipulating variables in the laboratory are developed.

These difficulties have an important implication for the con-
clusions one can draw from the results of laboratory experiments.
As in any study, it is possible that the experimenter is dealing with
entirely unrelated variables; that is, there may actually be no rela-
tionship among the variables that are being studied. Such a con-
dition would result in negative results, that is, no differences
between experimental and control groups. However, we should
also find a lack of differences between experimental and control
conditions if our experimental manipulations were not sufficiently
strong to reveal measurable differences even though such differences
really exist. Thus, negative results from a laboratory experiment
can mean very little indeed. If we obtain positive results, that is,
demonstrably significant differences among conditions, we can be
relatively certain concerning our interpretation and conclusion 
from the experiment. If, however, no differences emerge, we can 
generally reach no definitive conclusion unless we are quite certain 
that the manipulation of variables in the experiment was done 
successfully and adequately. At the present stage of technical devel- 
opment, it is seldom that we can be certain, in the absence of 
positive results, that our manipulations were adequate. Undoubt- 
edly, as more and more experiments are done, good evidence will 
become available for believing that a certain manipulation is an 
adequate one, and then negative results can be interpreted as 
demonstrating no relationship. At the present time, however, it is all 
too easy to set up a laboratory experiment which, because of the in- 
effective manipulation of variables, will show no differences among 
conditions. It should be stressed again that, at the present stage of 
technical development, negative results perhaps reveal only the fact 
that the experiment was not set up carefully and that the experi-
menter's attempted manipulation of the variables was ineffective.
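
The dependence of negative results on the strength of the manipulation can be illustrated numerically. What follows is only a minimal sketch under stated assumptions: it supposes a comparison of two groups by a two-sided z-test, and it treats the strength of the manipulation as a standardized difference between conditions. The function name `power_two_group`, the group size of 20, and the effect sizes are hypothetical choices made for illustration and do not come from any of the experiments cited.

```python
from math import sqrt
from statistics import NormalDist


def power_two_group(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided z-test comparing two group means.

    effect_size is the true difference between conditions in standard
    deviation units (an assumed stand-in for the strength of the manipulation).
    """
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability that the test statistic falls beyond either critical value.
    return z.cdf(shift - critical) + z.cdf(-shift - critical)


if __name__ == "__main__":
    for d in (0.0, 0.2, 0.5, 0.8):
        print(f"effect size {d:.1f}: power {power_two_group(d, 20):.2f}")
```

Under these assumptions a real but weak difference of 0.2 standard deviations would be detected in roughly one experiment in ten, so that a failure to find differences says little about whether a relationship exists.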

Keeping in mind these difficulties and the relationship which 
must exist between laboratory and field investigation, we shall now 
proceed to a more detailed examination of how laboratory experi-
ments can be performed.


THE DESIGN OF LABORATORY 
EXPERIMENTS 

The first and foremost requirement for a successful laboratory
experiment is that the problem be stated in experimental terms.
This means, essentially, that there must be a high degree of speci-
ficity and clarity in the statement of the problem and in the defi-
nition of the variables involved. The foregoing implies that before
one can successfully do a laboratory experiment, one must already 
know quite a bit about the phenomena one is investigating. 

The process of specifying and clarifying the statement of a 
problem so that it is amenable to experimental treatment is by 
no means a simple or easy one. Let us take an example to illustrate 
the kinds of problems which confront the experimenter at this
stage. In a field study of transmission of a rumor in an organiza-
tion (4), it was observed that communication tended to be directed
upward in the organizational hierarchy. This result was explained
as depending upon forces acting on members to move upward in
the organization; i.e., the upward communication represented sub-
stitute movement on the part of the members.

Kelley (19) set out to perform a laboratory experiment to test
this hypothesis more thoroughly. At this point the statement of his
problem might have been: "What direction does communication
tend to take in a structured hierarchy?" This statement, however,
is still much too general and vague for the purposes of an experi-
ment. An attempt to think in terms of setting up an experiment
makes it immediately clear that one must answer questions such as
"What exactly is a hierarchy?" and "Exactly what kinds of com-
munication are we talking about?" There are many aspects to what
is customarily thought of as a hierarchical structure. Do superior
levels in the hierarchy have power over subordinate levels and,
if so, what kinds of power? Is each successive level upward in the
hierarchy characterized by increased attractiveness of the work, or
increased freedom of choice of what work to do, or increased im-
portance of the work? For the purpose of setting up a laboratory
experiment, the theory involved and the definition of hierarchy must
be made more specific. Kelley chose to establish a hierarchy in the
laboratory on the basis of the perceived importance of the job to
the subjects, holding the actual attractiveness of the job and the
exact work that was done constant for both levels in the hierarchy.

Let us now consider the question of what kind of communi-
cation would be expected to go upward in such a hierarchy. It was
clear that a distinction had to be made between work-oriented
communication, communication of criticism, communication of
information, and communication which was irrelevant to the task.
It was in this last category that the effect of substitute movement
would be expected to appear. Consequently, the experiment was set
up to allow and, in fact, to encourage communication of irrelevant
content. The final problem in Kelley's experiment was phrased as:
"What is the direction of irrelevant communication content in a
hierarchy based upon perceived differential importance of the
task?" This statement was specific enough to permit the design of
the actual experiment. This process of clarifying the objectives of
the experiment takes considerable time, although it may not take
long to describe after it has
once been done. 

The difficulties of designing a laboratory experiment are by 
no means overcome when the problem has been specifically defined. 
There remain the major tasks of inventing measurement devices 
and techniques for manipulation of variables which will clearly 
measure and manipulate the variables which have been defined in
the statement of the problem. No matter how specifically and clearly 
the concepts are defined in the statement of the problem, the 
laboratory experiment cannot be successful unless the measurement 
and the manipulation of variables actually relate to these defined 
concepts. 

Thus, for example, in the Kelley (19) experiment mentioned 
above, it was necessary to develop techniques for producing a 
hierarchy as defined, while other variables, such as the type of work 
done, power, and attractiveness, would be controlled. The situation 
created had to be one in which irrelevant communication would 
occur. Adequate techniques for measuring the amount and direc-
tion of communication had to be developed. In the experiment,
a two-level hierarchy was established. Each level did exactly the
same kind of work, although each was under the impression that
the other level was doing something different. High and low hier- 
archic perceptions were encouraged by the instructions to the sub- 
jects: one subgroup was told that its own job was the important 
one; the other subgroup was told that the job of the other level 
was the more important. Communication of irrelevant material 
was encouraged by having all communication carried on in writing 
and by injecting into the communication stream prepared fictitious 
notes which were irrelevant in their content, thus encouraging sub- 
jects to do such writing themselves. All notes were collected and 
kept, and thus analysis of the content of the communication, its 
direction, and amount was possible. 

It is rarely safe to assume beforehand that the operations used
to manipulate variables will be successful and will tie in directly
with the concept the experimenter has in mind. It is a worth-while
precaution to check on the success of the experimental manipula-
tions. In the experiment by Kelley, the subjects were asked a number
of questions after the session was over to determine whether or not
the manipulation of status in the hierarchy had been successful.
It was found that, in terms of their reported perception of status
and their desire to be in the other role, the manipulation had cre-
ated a difference between the two levels. This difference was a
relatively small one, however. Small differences in the results could
be directly attributed to the small difference in perceived status. 
When the difference in perceived status was made larger by selecting 
out those subjects for whom the experimental manipulations were 
clearly successful, the results became much clearer and more con-
clusive. If there had been no check on the success of the experi-
mental manipulation, such analysis would have been impossible.
It would also have been impossible to attribute unequivocally the 
inconclusiveness in the results to the relative inadequacy of the 
experimental manipulation. 
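
The logic of such a check can be sketched in a few lines. The example below is an illustration only and does not reproduce Kelley's data or his analysis: the ratings are invented, the seven-point scale and its midpoint of 4 are assumptions, and the function names `status_gap` and `manipulation_succeeded` are hypothetical.

```python
from statistics import mean


def status_gap(subjects):
    """Mean perceived-status rating of the 'high' level minus that of the 'low' level."""
    high = [rating for level, rating in subjects if level == "high"]
    low = [rating for level, rating in subjects if level == "low"]
    return mean(high) - mean(low)


def manipulation_succeeded(subjects, midpoint):
    """Keep only subjects whose rating falls on the intended side of the midpoint."""
    return [
        (level, rating)
        for level, rating in subjects
        if (level == "high" and rating > midpoint)
        or (level == "low" and rating < midpoint)
    ]


# Invented post-session ratings on an assumed 1-7 scale: (assigned level, rating).
ratings = [
    ("high", 6), ("high", 4), ("high", 3), ("high", 7),
    ("low", 2), ("low", 4), ("low", 5), ("low", 1),
]

if __name__ == "__main__":
    print("all subjects:", status_gap(ratings))
    print("successful only:", status_gap(manipulation_succeeded(ratings, 4)))
```

In this invented case the difference in perceived status between the levels is larger among the selected subjects than among all subjects, which is the kind of comparison that a recorded check makes possible.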

The problem of the adequacy of the manipulation of variables 
may be dealt with in part by preliminary studies. In almost any 
laboratory experiment, the initial design will have certain inade- 
quacies which will become clear after a few trial experiments. Such 
preliminary runs are also important to provide practice for the 
investigator so that his behavior and his instructions become stand- 
ardized by the time the regular experiments start. 

THE EXECUTION OF LABORATORY
EXPERIMENTS

Techniques of measurement, manipulation, or control of vari-
ables can be introduced at almost any stage in the process of a
laboratory experiment. We shall attempt, in the following pages,
to cover in detail most of the techniques which have been used
fruitfully and to give examples of their successful use.

Decisions about Subjects for the Experiment 

Decisions about the kinds of persons to be used as subjects, 
how they are to be recruited, and what they are to be led to expect
before they come to the experiment provide important opportuni-
ties for the manipulation of variables.

Controlling the Composition of the Group

It is possible to arrange the composition of the group so as to
control the number of friends in each group or to select subjects to
ensure that all of the members of a group are strangers to one
another at the beginning of the experiment. The decision concern-
ing the composition of the group depends, of course, upon the
purpose of the experiment and on the variables upon which the
experimenter desires to focus his investigation. We shall give some
examples of the introduction of an experimental control or manip-
ulation at this stage of the procedure.

The experiment by Festinger (10) previously referred to had
as its objective the determination of whether knowledge of religious
affiliation in a mixed Catholic-Jewish group would affect the atti-
tudes of members toward one another. It was assumed that these
attitudes would be reflected by their votes in elections for officers
of a club. It was decided to have groups meet in the laboratory and
elect officers of a club into which they formed themselves. Half of
the elections were to take place while no one in the group knew
the religious affiliation of anyone else; the other half of the elec-
tions were to take place after the religious affiliation of each member
was publicly announced. It was obviously essential, for this pro-
cedure to be successful, that none of the six members of any group
know one another. Contact was made with nine colleges in the
Boston area and permission to recruit volunteers in each college
was obtained. Experimental sessions were then scheduled so that
in each group only one person from any one college was present.
Thus, when the group met, the six members each came from a
different college in the area and the chances of their knowing each
other were quite low. In spite of all these precautions, however,
one out of 13 groups had to be eliminated because two of the
members did know each other, having gone to high school together.
In the other 12 groups all the members were complete strangers
to one another.
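
A scheduling rule of this kind, in which no two members of a group are drawn from the same source, can be sketched as follows. This is an illustrative procedure, not the one the investigator actually followed: the function name `schedule_groups`, the greedy rule of drawing first from the largest remaining pools, and the volunteer lists in the usage lines are all assumptions. The sketch also ignores the further requirement of an even division between the two religions.

```python
from collections import defaultdict


def schedule_groups(volunteers, group_size):
    """Form groups in which no two members share a source (e.g., a college).

    volunteers: list of (name, source) pairs.
    Returns (groups, unassigned); each group is a list of names.
    """
    pools = defaultdict(list)
    for name, source in volunteers:
        pools[source].append(name)

    groups = []
    while True:
        # Draw from the sources with the most volunteers still waiting,
        # which keeps the pools balanced for later groups.
        available = sorted(
            (s for s in pools if pools[s]),
            key=lambda s: len(pools[s]),
            reverse=True,
        )
        if len(available) < group_size:
            break
        groups.append([pools[s].pop() for s in available[:group_size]])

    unassigned = [name for names in pools.values() for name in names]
    return groups, unassigned


if __name__ == "__main__":
    # Hypothetical pool: nine colleges with four volunteers each.
    pool = [(f"college{c}_volunteer{v}", f"college{c}") for c in range(9) for v in range(4)]
    groups, left_over = schedule_groups(pool, 6)
    print(len(groups), "groups formed;", len(left_over), "volunteers unassigned")
```

Such a rule guarantees only that members come from different sources; as the discarded group shows, it cannot rule out acquaintance formed elsewhere.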

Schachter (26), in an experiment designed to investigate the
relationships between difference of opinion and rejection, also
wanted his groups composed of strangers to minimize the effects of
past history, such as established preferences or aversions among
members. Having strangers was important because he was partic-
ularly concerned with the effect of the experimental condition
upon acceptance and rejection. He recruited volunteers from courses
which were divided into small recitation sections. By scheduling,
in any one group, only one person from any one recitation section,
he was fairly successful in eliminating prior acquaintanceship.

In both the examples above, having strangers compose the
group was a technique used to exercise additional control over
the experimental situation. In experiments on the effects of dis-
cussion on opinions about matters of fact, Jenness (18) controlled
the range of difference of opinion in the group by the assignment
of subjects to given groups on the basis of their original estimates
of the facts in question. French (16), in an experiment on the
effects of frustration and fear, used the composition of the group
as a means of manipulating a variable. He was concerned with the
differential effects of frustration and fear upon organized and unor-
ganized groups. For his unorganized groups he used subjects re-
cruited at Harvard University who met together as a group for the
first time in his laboratory. For his organized groups he used club
members who had a long history of working together and engaging
in activities as a group. The members of each organized group came
to the laboratory together. This type of manipulation is, of course,
a gross one, since an organized group is different in many ways from
an unorganized one. The same type of manipulation of the com-
position of a group can, however, be used in any number of ways
to produce fine or gross differences among conditions. Some of the
earliest experiments with groups, for example, employed as their
major variable the presence or absence of other persons (1). Whether
the person worked alone or in a group of people or before an audi-
ence was found to affect his performance (8).


Duration of the Group's Existence

Before recruiting subjects, it is necessary to decide whether the
experiment will be conducted in one meeting or whether the group
will be required to continue for several sessions. Each of these
procedures has advantages and disadvantages. If the experiment is
to be performed in only one meeting, it is generally easier to obtain
volunteers. If the experimenter is restricted to one session, however,
it may be more difficult to manipulate variables adequately. On
the other hand, if the experimenter plans on more than one meeting
per group, he must expect that a certain percentage of subjects will
not return after the first meeting.

Designs which require the group to meet several times encounter
another difficulty. Many uncontrolled factors may be introduced,
since the subjects may contact one another outside the experiment
and, in this way, materially change the situation between experi-
mental meetings. The decision as to which of these two types of
experimental designs to employ depends, again, upon the objectives
of the experiment and on how these objectives can best be accom-
plished. A number of examples of each kind of experiment will be
given to illustrate the advantages and difficulties.

Deutsch (9), in his study of the effects of competitive and coop-
erative situations on group problem solving, felt that the full effects
of the experimental variables would reveal themselves only if the
group would have considerable experience working together under
the prescribed conditions. He decided on six successive meetings
of each group and, to accomplish this, persuaded the instructor of
a course to give students credit for participating in his experiment.
Under these conditions most subjects attended all six sessions. Such
an arrangement is not usually possible, but it is generally necessary
to have some means of ensuring that subjects will return when the
group is to meet several times.

Schachter's (26) experiment on rejection of deviates used one
meeting of each group. It was necessary, however, for the subjects
to be under the impression that they were to continue to meet once
a week for a period. He recruited subjects by telling them about
clubs that were being formed and giving them the opportunity to
join one of the clubs. Subjects were told that by joining they were
committing themselves to attend the first meeting. After the first
meeting they would be able to decide for themselves whether or not
they wanted to continue.

In an experiment on strength of attraction to groups, Libo (22)
used the number of meetings which subjects attended as one of
the major measures of the strength of their attraction to the group.
He, too, gave subjects an opportunity to volunteer to join clubs
which were to continue to meet every week. Subjects could decide,
after the first meeting, whether they wanted to continue their
membership. Little pressure was applied to the subjects to return
to subsequent meetings. The number of meetings actually attended
was assumed to reflect their attraction to the group.


Starting the Manipulation of a Variable

It is possible, and sometimes necessary, to start the manipula-
tion of an experimental variable at the time the subjects are
recruited for the experiment. This can be done by providing vari-
ous expectations for the subjects which will affect the attitudes with
which they come to the experimental situation, or by collecting
information which will later be used to manipulate a certain desired
variable. We shall give some examples of the experimental manip-
ulation of a variable which begins at the time of recruitment.

Several experiments (3, 14, 28) have varied attraction to the
group experimentally by manipulating the degree to which the
subjects expected they would like, and be congenial with, the other
members of the group. At the time of recruiting, those who volun-
teered to be subjects were asked to answer a number of questions
which concerned characteristics of themselves, characteristics which
they liked in other people, and characteristics which they disliked
in other people. No attention was actually paid to these data in
setting up a group but, because the subjects had provided such
information, the experimenters were plausibly able to tell some
groups that the members would like one another and be congenial
and to tell others that they would not be very congenial. The results
of such experiments showed that the manipulations were successful.

Schachter (26), in his experiment on the rejection of deviates,
wanted to manipulate attraction to the group on the basis of inter-
est in the activity in which the group was to engage. When the
subjects were asked to join one of the clubs, each club was described
in detail. Those who desired to join filled out an information sheet
on which they were asked to give ratings of how interested they
were in joining each of the available clubs. Some groups were com-
posed of subjects who were highly interested in joining that specific
club (high attraction to the group), whereas other groups were
composed of persons who had indicated relatively low interest in
joining that specific club (low attraction to the group).† This manip-
ulation of attraction to the group was also shown to be successful
by the results and by answers which subjects made to questionnaires
after the experiment.


Size of the Experimental Groups 

No matter what techniques the experimenter employs, there will 
always be some subjects who, after having agreed to be at the lab-
oratory at a certain time, will not appear. They may have forgotten,
they may have changed their minds, or something may have hap- 
pened which made it impossible for them to attend. In any event, 
the problem for the experimenter is the same. In designing a lab- 
oratory experiment in which human subjects are to be used, it is 
well either to design the experiment so that it may be conducted 
with a variable number of subjects or to make some provision to 
ensure the proper number of persons in each group. It is generally 
most desirable to allow for variation in the number of subjects. 
Thus, for example, an experiment may be designed so that it can 
be conducted with either five, six, or seven members in the group. 
If seven persons are then scheduled for each meeting, and if sufficient
precautions are taken,* very few groups will be lost. 
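
The advantage of allowing for variation in the number of subjects can be made concrete with a simple calculation. The sketch below assumes that each scheduled subject appears independently with the same probability; both that assumption and the attendance rate of 0.85 used in the example are hypothetical, as is the function name `prob_usable_group`.

```python
from math import comb


def prob_usable_group(scheduled: int, min_size: int, show_prob: float) -> float:
    """Probability that at least min_size of the scheduled subjects appear,
    assuming each appears independently with probability show_prob."""
    return sum(
        comb(scheduled, k) * show_prob**k * (1 - show_prob) ** (scheduled - k)
        for k in range(min_size, scheduled + 1)
    )


if __name__ == "__main__":
    # Seven persons scheduled, assumed attendance rate of 0.85.
    print(f"design usable with 5 to 7 members: {prob_usable_group(7, 5, 0.85):.2f}")
    print(f"design requiring all 7 members:    {prob_usable_group(7, 7, 0.85):.2f}")
```

On these assumptions a design that can be run with five, six, or seven members yields a usable group about 93 per cent of the time, whereas a design requiring all seven would succeed only about 32 per cent of the time.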

† This is not strictly an experimental manipulation of a variable. Rather,
it represents selection of subjects on the basis of some measure in order to create
contrasting conditions.

* There are many factors which will affect the proportion of subjects who,
having volunteered, actually come to the experiment. If, for example, volunteers
are recruited from university classes, the more pressure applied upon them to
participate, the lower the proportion of subjects who appear when scheduled (27).

When a design requires a constant number of subjects in each
group, there are a number of techniques to ensure the presence of
the proper number. Festinger (10), in his experiment on the effects
of knowledge of religious affiliation, felt it necessary to keep the
size of the groups exactly constant at six subjects per group. Three
of these were to be Jewish and three Catholic. This was essential
because of the desire to have the group evenly divided between the
two religions. Leeway in the number of subjects in each group
would have produced deviations from an even division which might
have introduced additional complexities. Before each experiment
each subject was written a letter stressing the importance of her
coming to the experiment. On the day before the meeting, each 
subject was spoken to by telephone to make sure that she would 
be present. In spite of these efforts, only five subjects appeared in 
a number of groups. In most of these instances the subjects who 
had arrived agreed to wait while others who had volunteered were
telephoned until an appropriate person was reached who agreed 
to come down immediately. By this procedure very few groups had
to be discarded. In Pepitone's (25) experiment on group produc- 
tivity, the situation was designed so that it was essential to have 
three subjects present in each group. The group was to work on a 
task which was divided into three parts, each of which had to be
performed by one subject. The experimenter scheduled four sub- 
jects for each group. Occasionally only two subjects appeared and 
the group had to be canceled; most frequently three subjects 
appeared. When all four came, the last one was taken aside, the 
situation was explained to him, and he was allowed to observe the 
experiment in progress. 


THE CONTENT AND FORM OF THE
EXPERIMENTAL SITUATION 

The investigator must make a number of decisions concerning 
how the situation is to be structured cognitively for the subjects, in 
what kinds of activities they will engage, and with what attitudes 
they come to the experiment.


"Real" or "Experimental" Situations

The experimental situation can vary from one which is frankly 
experimental to a situation which, for the subjects, is a "real" one.
The pros and cons for the various possibilities within this range are
by no means all clear. Good evidence is lacking concerning which 
types of experimental situations are superior for which purposes. 
We shall, however, discuss some of the considerations which might
lead an experimenter to set up his groups in one or another manner.

To discuss these advantages and disadvantages we must explain
somewhat farther the distinction between a situation which is
"real" for the subject and a situation which is "experimental"
for him. All of the situations are, in a sense, real for the
subject, and all of them, likewise, are experiments from the
point of view of the investigator. Some examples from other fields
of investigation may illustrate our point more clearly. If a psychol-
ogist does an experiment in discrimination learning, using rats as
subjects, the situation is obviously an experimental one for the
investigator. For the rat, however, it is undoubtedly a very real
situation. The maze or discrimination box is a place where he works
and gets fed. The basis of the "reality" of the experimental situation
for the subject is somewhat less clear when humans are used as
subjects. Thus, for example, in an experiment on level of aspiration
the subject may come to the laboratory knowing he is to help in
an experiment. He is given a series of tasks to perform and is asked,
before each task, what he is going to try to score on the subsequent
task. One may well ask: In what sense is this situation a real one for
the subject? Certainly it is not "real" in the sense that it is a situ-
ation similar to those which the subject encounters in the ordinary
course of events; on the other hand, it is certainly "real" in the sense
that powerful motives are brought into play and strong forces are
set up which act on the subject and determine his behavior in
lawful ways. Thus, the situation in which one places the subject
can be "real" for him in that it brings into play powerful forces,
regardless of whether or not it is cognitively an experimental situ-
ation for him.

If the situation is cognitively a real one for the subject, it is
probably easier to bring powerful forces into play. It may be more
difficult to produce equally strong forces if the situation is cog-
nitively experimental. In the latter case the strength of the forces
which can be brought into play depends largely upon the relations
between the subject and the experimenter, the motivations which
made the subject decide to volunteer for the experiment, and his
desire to cooperate. These forces can, in the proper circumstances,
be quite strong. It is much easier to create a laboratory situation
which is cognitively experimental for the subjects. To create a
cognitively "real" situation and still be able to control and manip-
ulate variables successfully may require a great deal of subterfuge
and much attention to technical details. If the subject sees through
the subterfuge, the whole experiment may be invalidated.

We have, then, these relative advantages and disadvantages
which the experimenter must consider when deciding whether to
make the experimental situation cognitively experimental or cog-
nitively real for the subject. If the experiment is cognitively real,
it will be easier to make it motivationally strong. On the other hand, 
if the situation is cognitively experimental, it will be easier to set 
up with an adequate amount of control and precision. The examples 
below illustrate the kinds of decisions which have been made on this
question. 

Lippitt (23), in his experiment on the effects of the behavior 
of autocratic and democratic leaders, chose to make the experi-
mental situation cognitively real for the subjects. To do this he
organized school-age children into clubs which had their club 
rooms in the investigator’s laboratory. The experimenter functioned 
as the adult leader of these clubs. In this role he was able to manip- 
ulate the desired variables. Because of the desire to maintain a 
cognitively real situation, the possible variations in the leader’s 
behavior were also limited. The differences between conditions that 
were produced were rather gross. It is possible that the lack of 
control and precision in this experiment offset the advantages gained 
by having a cognitively real situation. 

Schachter (26), in his experiment on rejection of deviates, also 
chose to have a cognitively real situation for the subjects because 
the major measures of rejection were to be obtained from verbal 
responses to questions. The investigator felt that these responses
would have more validity if they were commitments to action on 
the part of the subject rather than answers to hypothetical questions. 
To obtain a cognitively real situation, he organized clubs of college 
students. 

Once more a major difficulty was the restriction on the manip- 
ulation of variables. Manipulations had to be devised which were 
consistent with the notion of a bona fide club. To create groups 
with high and low cohesiveness, the investigator first ascertained 
the degree of interest of the subjects in each of two kinds of clubs 
and then manipulated the attraction to the group by composing 
some groups of persons who were all highly interested in the activity 
and other groups of persons who were only mildly interested. This 
type of manipulation of a variable by selection is probably not so 
satisfactory as other techniques would be. Because of the cognitively
real aspects of the situation, it was also not possible for the experi-
menter to engage in any further manipulation of variables while
the meeting of the group was in progress. These problems, in this 
experiment, were satisfactorily solved by the use of paid participants, 
a technique which will be described later. 

In an experiment on the effects of knowledge of religious 
affiliation, Festinger (10) decided to use a cognitively experimental
situation. This decision was made because it was obviously of 
importance to control the group session firmly and to carry on 
manipulations of variables while the session was in progress. The 
group consequently met with the knowledge that it was helping 
in an experiment. They were told to "imagine" that they were a
club. There is no doubt that the forces in this situation were weaker 
than the forces which would have operated had the subjects actually 
been members of a club engaged in the same procedure. By virtue 
of the cognitively experimental aspects of the situation, however,
this disadvantage of weaker motivation was counterbalanced by the 
precision of measurement and the control of extraneous variables. 

The Choice of Activity for the Group 

The choice of the activity in which the group, once assembled 
in the laboratory, is to engage is somewhat dependent upon the 
decision concerning the cognitive reality of the experiment. There 
is, of course, much leeway in the choice of activity, although it must 
be one which is consistent with the purposes of the experiments 
and does not conflict with the other experimental decisions which 
have been made. If the experimental situation is to be cognitively 
real, there are restrictions on the type of activity which can be 
employed. If the situation is to be cognitively experimental, there 
is much less limitation and the selection of an activity which is well 
suited to the experimental purposes is easier. The activity must be 
chosen to allow for the manipulation of the variables, the collection 
of the measures in which the investigator is interested, and the 
arousal of sufficiently strong forces so that the effects will be meas- 
urable. It is impossible, of course, to list all of the various activities 
in which laboratory groups may engage. We shall present a few 
examples of different kinds of activities which have been used and 
the reasons for their use.

Perhaps the most frequently employed group activity is discus-
sion. Such an activity may be chosen when the purpose of the experi-
ment is either to study the involvement of people in an activity,
the amount of participation in an activity, or the communication
or influence process that goes on in groups, or to provide a relatively
interesting activity which will involve the subjects in order for the
experiment to accomplish some other purpose in the meantime.
Any topic which will be interesting to the subjects is suitable. The
discussion may concern differences in opinion, as in the experiment
by Back (3); it may be directed toward solving a problem, as in the
experiment by Deutsch (9); or it may involve a sharing of experi-
ences, as in the study of Festinger, Pepitone, and Newcomb (15).

When children are used as subjects, a play activity may fre-
quently be appropriate. Thus Thibaut (30), when he endeavored
to create privileged and underprivileged subgroups, had one sub-
group play an interesting and enjoyable game while the other
subgroup took the role of helpers and servants to those who were
actively engaged in having fun. Lippitt (23), in his experiment on
autocratic and democratic leader behavior, used various games and
craft activities which were appealing to school-age children.

It is also possible to use work situations as the activity for the
group. Kelley (19) felt that a work situation would be more con-
ducive to the establishment of a status hierarchy, so in order to
create a two-level status hierarchy he used a work task in which the
subjects had to arrange bricks according to a certain pattern. Pepi-
tone (25), in an experiment on group productivity, used a work task
which was constructed so that measures of production would be
relatively easy to obtain.

These are but a few of the many possible examples of activities
that can be prepared for a group. There is almost limitless room
for the experimenter's ingenuity to create a situation which will be
best for his experimental purpose.


The Orientation of the Subjects 

Related to both the cognitive nature of the situation and the
activity in which the group is to engage is the problem of what
orientation to give the subjects in the experiment. It is highly de-
sirable to have some plausible and understandable purpose for the
experiment which the investigator can communicate to the subjects
and which they will accept. If this is not done, the subjects usually
conjecture about it and make guesses as to the true purpose. If a 
plausible orientation is not given, this important aspect remains 
uncontrolled. 

The orientation which the experimenter gives the subject at 
the beginning should be plausible and should remain plausible as 
the experiment progresses. It is usually important that this plausible 
orientation not reveal to the subject the true focus of the experi-
ment. The true purpose of the experiment and the true focus of the
investigator’s interest can, and should, be revealed to the subjects 
at the conclusion of the experiment. 


TECHNIQUES FOR THE CONTROL AND 
MANIPULATION OF VARIABLES

Since the basic purpose of a laboratory experiment is to achieve 
a simple situation in which certain variables can be well controlled 
while others can be varied at will, we shall attempt, in the present 
section, to be as detailed as possible. We shall illustrate not only 
the various techniques which have been developed for controlling 
and manipulating variables but also the kinds of variables which 
have been successfully controlled and manipulated in the laboratory. 


Use of Pre-experimental Instructions

The most obvious technique for controlling or manipulating 
variables is the use of pre-experimental instructions to the subject. 
Such pre-experimental instructions vary greatly in their effective-
ness. It is probably safe to say that instructions to the subjects will
be successful in manipulating variables when these instructions are 
kept simple, are given emphatically, and are plausible in the sense 
of being integrally related to the experimental activity in which 
the subjects are to engage. The major dangers in the use of instruc- 
tions as a device for manipulating variables are (1) the possible 
inattention of the subjects when the instructions are given and 
(2) the possible variability from subject to subject in interpretation 
of the instructions. Because of these difficulties, it is probably unde-
sirable to manipulate more than one variable at a time by
the use of pre-experimental instructions. Instructions which attempt
to manipulate several variables simultaneously are likely to be
so complex and so long that they render the manipulation ineffec-
tive. We shall illustrate the problems involved in the use of instruc-
tions by giving examples of successful and unsuccessful attempts at
manipulating variables in this manner. 

Deutsch (9), in his experiment on competitive and cooperative
groups, produced competitive or cooperative situations by differ-
ential instructions to the groups. In the competitive groups he told
the subjects that all the members would be ranked according to 
their contributions in solving the problems given to the group and 
that their grades in the course would depend in part upon these 
rankings. It was explained that, thus, the one in that group who 
contributed most, irrespective of how the group as a whole per- 
formed, would get the highest grade and the one who contributed 
least would get the lowest grade. In the cooperative groups the 
experimenter told the subjects that their group was going to be 
compared with other groups, that everyone in the group would 
receive the same grade, and that this grade would be determined
by how well the group as a whole did. These instructions were 
successful in creating the required conditions, and they provide a
good example of how instructions can be integrated into the experi-
ment. They were successful because they provided essential explana-
tion of the situation to the subjects: they defined the goals for the
subject and defined the manner in which these goals were to be
reached.

Back (3), in his experiment comparing groups of high and low
cohesiveness, wanted to vary the attraction to the group by using
several [...]. In some groups he wanted to create
[...] on the basis of personal attraction; he [...] the information they had
[...] matched people in this [...] congenial and like
[...] them. [...] other [...] upon
the personal goals that could be achieved through membership.
This was done by informing the subjects that there was (or was not) 
a reward that would be given as a prize to each of the members in 
the best group. 

These instructions were probably moderately successful. On the 
one hand, they were not integral to the experimental task. That is, 
the subjects could have done everything the experimenter required 
of them without these instructions ever having been given. The 
possibility of winning a reward or the likelihood that members 
would get along well with others in the group was, however, relevant 
to fairly important motives in the subjects. They probably were 
concerned about whether or not they would like the other persons 
and be liked by them. The possibility of a reward probably added 
to the motivation to do well in the eyes of the experimenter. The 
results of the experiment show that a difference between high and 
low attraction was created by means of these instructions.

In an experiment on the direction of communication in a 
group, Festinger and Thibaut (12) wanted to manipulate the sub-
ject's perception of the homogeneity or heterogeneity of the group.
To create the perception of homogeneity, groups were told that the 
members had been carefully selected so that they were all in the 
same year in college and had equal interest in, and knowledge about, 
the problem they were to discuss. To create the perception of a 
heterogeneous group, they were told that great differences existed 
among them in their knowledge about, and interest in, the problem 
under discussion. The manipulation of the variable by these instruc- 
tions was only mildly successful. Probably few of the subjects were 
much concerned with whether the group was homogeneous or heter- 
ogeneous. Although differences between these conditions were ob- 
tained in the results, these differences were by no means strong. 
It might be expected that a more adequate manipulation of these 
variables would have produced much larger differences between the 
conditions. 

In an experiment by Festinger et al. (14), an attempt was made
to manipulate three variables simultaneously, all by means of verbal 
instructions at the beginning of the experiment. The investigators 
were interested in the interaction among the variables of attraction 
to the group, perception of whether or not there were experts in 
the group, and perception of whether or not there was a correct
answer to the discussion problem. This attempt to manipulate all
three variables by pre-experimental instruction was not very suc-
cessful. The amount of instruction which had to be given to the
subject and the complexity of the instructions rendered them rather 
ineffective. It probably would have been better to manipulate one 
of these variables by instructions and to have devised techniques 
for manipulating the other two in other ways. We shall discuss 
below such other techniques of manipulating and controlling 
variables. 

Use of False Reporting 

False reporting to the subjects of the results of votes or of 
sociometric choices and the like is another technique for control 
and manipulation of variables. Such false reporting must always be 
done in a manner which will make the report appear plausible. 
If sufficient care is used to ensure the acceptance of the report as
true, this can be an effective means of manipulating some kinds of 
variables. 

Festinger (10), in his experiment on the voting behavior of 
Catholics and Jews in mixed groups, used the technique of false 
reporting to the subjects to keep the situation identical for all 
groups. The members of the group voted for officers of the club in 
the following manner. There was first a nomination ballot to select 
two candidates for the election. The members of the group who 
received the most votes were to be the candidates in the final elec-
tion. This nomination ballot was tabulated by the experimenter
and, since the ballots were secret, it was simple for him to report 
falsely which two members had won the nominations. In this man- 
ner the experimenter was able to control which two persons were 
the candidates in each election. This experiment also employed 
paid participants (the use of which will be elaborated below) who
were members of every group. By means of the false reporting of
the results of the nomination ballot, the two candidates for each
election in every group were two of the paid participants. One of
the two candidates in each election identified herself as Jewish and 
the other identified herself as Catholic. Each election in each group 
was, thus, a standard situation. 

In the experiment by Festinger et al. (14) in which an attempt
was made to manipulate simultaneously three variables by verbal
instructions to the subjects, a fourth variable was manipulated suc- 
cessfully by means of false reporting to the subjects. The subjects 
were to have a discussion among themselves concerning an issue
about which each of them had already formed an opinion. Before 
the discussion some subjects were given the impression that the 
group overwhelmingly agreed with their own opinion on the issue, 
whereas other subjects were given the impression that the group 
overwhelmingly disagreed with them. This was done in the follow-
ing manner. Each subject wrote, on a slip of paper, his opinion on
the issue which was to be discussed. Subjects were told that the 
experimenter would tabulate these and then give each person a 
tally which would show the opinion of each person in the group. 
Thus, knowing everyone's opinion, they would be able to proceed 
sensibly with their discussion. The tally which was handed to each 
of the subjects was entirely fictitious. Each of the subjects in whom
the perception of group agreement was to be created was given a 
tally which showed all but one of the subjects agreeing very closely 
with him. Each of those in whom the perception of disagreement 
with the group was to be created was handed a tally sheet which 
showed everyone in the group at least two opinion steps removed 
from his own opinion. This false reporting proved successful in
varying the degree of perceived agreement with the group. 
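
The construction of such fictitious tallies can be sketched in code. The Python sketch below is only an illustration: the seven-step opinion scale, the function name `fictitious_tally`, and the particular placement of the other members' opinions are assumptions made for the example and are not details reported for the experiment.

```python
def fictitious_tally(own, n_others, condition, scale=(1, 7)):
    """Return made-up opinions for the other group members.

    own       -- the subject's own position on the opinion scale
    n_others  -- number of other members shown on the tally
    condition -- "agree" or "disagree"
    scale     -- (lowest, highest) step; a 7-step scale is assumed here
    """
    lo, hi = scale
    if condition == "agree":
        # All but one of the others sit at, or one step from, the subject.
        near = [p for p in (own, own - 1, own + 1) if lo <= p <= hi]
        tally = [near[i % len(near)] for i in range(n_others - 1)]
        # The single exception is placed at the more distant end of the scale.
        tally.append(hi if own - lo < hi - own else lo)
        return tally
    if condition == "disagree":
        # Everyone is at least two opinion steps removed from the subject.
        far = [p for p in range(lo, hi + 1) if abs(p - own) >= 2]
        return [far[i % len(far)] for i in range(n_others)]
    raise ValueError("condition must be 'agree' or 'disagree'")
```

Each subject would receive a tally built from his own stated opinion, so that the same perceived relation to the group is produced whatever position he happens to hold.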

We shall conclude the discussion of the technique of false 
reporting to subjects with an illustration of an unsuccessful attempt. 
Festinger and Hymovitch (13) attempted to create in subjects a 
feeling of rejection by the group. Four subjects, strangers to one 
another, met in the laboratory and were told that they were to 
work on a task which required cooperative effort, although the 
various parts of the task would be divided among them. They were 
first to have a brief discussion among themselves and get to know 
one another so that they could decide how they wanted to organize 
the task. They were told that people who liked one another worked 
more productively together. Consequently, if there was any one in 
the group that they disliked, it would be better to exclude that 
person from the group. After the discussion, the subjects were given
ballots on which each could indicate whether he wanted to work 
together with all the others or wanted to eliminate a member from 
the group. If subjects chose the latter alternative, they wrote down
the name of the member they wanted to reject. Each subject was
then taken to a separate room and was told that the experimenter
would tell him the results of the ballot as soon as possible. Each
subject was then privately told that the others had unanimously
voted to reject him.

This false report to the subject was rarely successful. The over-
whelming majority of the subjects refused to accept it and imme-
diately suspected that the experimenter was not telling the truth.
The reasons for the failure were probably twofold. The experience
with the others in the preliminary discussion did not provide
grounds on the basis of which they could accept the reported rejec-
tion. Also, the false report was unpleasant enough so that the sub-
jects did not want to accept it. Many subjects refused to accept the
report even though they could not verbalize any reason for suspicion
or disbelief. This technique had to be abandoned in this experiment.


Use of Paid Participants

The use of paid participants who are part of the experimental
group and are accepted as such by the subjects is a powerful tech-
nique for the control and manipulation of variables. It is, however,
a relatively expensive and tedious procedure. When paid partici-
pants are used, the details of their behavior must be exactly planned
and much time must be spent training and rehearsing
them. We shall give some examples to illustrate the great variety of
uses to which such paid participants may be put.

A simple and effective use of paid participants to
manipulate a variable is found in an experiment by Sherif (29).

Pepitone (24) reports an experiment in which he investigated
the determinants of the perception of authority and approval in 
people. He was faced with the problem of how to provide a standard 
social situation for his subjects in which it would be meaningful 
to ask them for their perceptions of authority and approval. Using 
school-age children as his subjects, he let it be known in the school 
that, as part of a survey on interest in athletics, a three-man board 
would arrive in a few days to interview many of the students. Those 
who successfully answered the questions asked by the three-man 
board would win tickets to a college basketball game. The three- 
man board which came to the school and interviewed students 
individually consisted actually of three paid participants who had 
been trained by the experimenter. Scripts for each of the three 
had been carefully written so that each boy who was interviewed 
was asked exactly the same questions. The responses to the boys' 
answers were also standard for each of the conditions. In different 
conditions, however, the experimenter created authority differentials 
among the three board members and also differences among them 
in the extent to which they openly voiced approval of the boy who 
was being interviewed. The boy’s perception of the relative author- 
ity and approval among the board members could be ascertained 
in an interview with each boy directly after his appearance before 
the three-man board. Thus, the experimental situation was effec- 
tively standardized. 

Schachter (26), in his study of rejection of deviates, had three 
paid participants in each group. The topic for discussion was chosen 
so that all of the subjects would have opinions which very nearly 
agreed with one another. Paid participants were used to create 
various conditions of deviation from this group norm. One paid 
participant voiced an extremely deviant opinion and held to it
throughout the discussion. Another paid participant voiced a devi-
ate opinion at the outset but allowed himself to be influenced so
that, in the end, he agreed with the other subjects. The third paid
participant agreed at the beginning and continued to agree with
the modal opinion in the group. Thus, standard conditions of devia-
tion from the group norm were achieved and, by rotating the paid
participants among the various roles from group to group, it was
also possible to equate for personality factors. We must emphasize
that these paid participants had been very carefully trained in how
to behave in the group and in what kinds of things they could
and could not say. 
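
The rotation of paid participants among roles can be sketched as a simple cyclic schedule. The Python sketch below is an illustration only: the function name `rotate_roles`, the participant labels, and the role labels are hypothetical, and a cyclic shift is merely one way of arranging that every participant plays every role equally often.

```python
def rotate_roles(participants, roles, n_groups):
    """Assign each paid participant a role in each group by cyclic rotation."""
    if len(participants) != len(roles):
        raise ValueError("need exactly one participant per role")
    k = len(roles)
    schedule = []
    for g in range(n_groups):
        # Shift the assignment by one position for each successive group.
        schedule.append({participants[i]: roles[(i + g) % k] for i in range(k)})
    return schedule


# Hypothetical labels: three paid participants, three standard roles.
participants = ["P1", "P2", "P3"]
roles = ["holds deviant opinion", "deviant, then yields", "agrees throughout"]
schedule = rotate_roles(participants, roles, n_groups=6)
```

When the number of groups is a multiple of the number of roles, each participant fills each role the same number of times, which is the sense in which personality factors are equated across roles.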

In the study by Festinger (10) of the effect of knowledge of 
religious affiliation, four paid participants were members of every
group which met. These paid participants were relied upon to 
control many variables and to create a standard situation. In the 
middle of the experiment, when everyone was identified according 
to her name and religious affiliation, two of these paid participants
announced that they were Catholic and two announced that they 
were Jewish. The ones who said they were Jewish or Catholic were 
rotated from group to group so that actual religious affiliation and 
personality differences were equated among all the conditions. In 
this manner, many powerful variables, which would affect prefer-
ences for people, were controlled and the effects of knowledge of
religious affiliation were permitted to emerge quite clearly. 

The three foregoing examples of the use of paid participants 
in laboratory experiments hardly demonstrate adequately the pos- 
sible range of uses to which this technique may be put. With suffi- 
cient ingenuity on the part of the experimenter and sufficient time
in planning the behavior of the paid participants and in adequately
training and rehearsing them, very powerful effects can be produced.
There is ample evidence of the success of the control and manipula- 
tion of variables with the aid of paid participants. 


Restriction of Behavior Possibilities 

It is possible to exercise control over a situation and to manip- 
ulate variables by creating a situation which restricts the possibilities 
of behavior. 

Festinger and Thibaut (12), in their experiment on the de- 
terminants of direction of communication, restricted the group to 
the use of written notes in carrying on their discussion. This decision
was made for a number of reasons. If the discussion had been an 
oral one, the direction of communication (who spoke to whom) 
would have had to be recorded by observation of the group while 
the discussion was in progress. Such observation in fairly large
groups is difficult and sometimes quite unreliable (see Chap. 9). By 
the use of written notes, a permanent record was immediately avail- 
able. The exact time each note was written was recorded on it
before it was delivered to another group member, so that the whole
communication process could be reconstructed in the analysis. Aside 
from these measurement problems, there were other reasons for re- 
stricting the discussion to written notes. In an oral discussion, the 
person who is talking may be primarily addressing one or two others 
in the group, but, whether he likes it or not, what he is saying is si- 
multaneously heard by everyone. This introduces additional complex- 
ities. By limiting the communication process to written notes, with the 
further restriction that each note could be sent to only one person, 
the situation was kept simple and manageable. A further difficulty 
in using an oral discussion for the purposes of this experiment is 
the marked tendency for people to answer when remarks are ad-
dressed to them. This is fully demonstrated by the usually high
correlation obtained between the number of times a person com-
municates to others and the number of times he is the recipient
of communication (17). Since the experimenters were concerned 
primarily with other determinants of the direction of communica- 
tion, this would have been a complicating factor. The further restric- 
tion that the written notes could not be signed avoided this com- 
plication. The recipient of a note did not know from whom it came.
The pads of paper on which the subjects wrote their notes were 
marked so that later, in the analysis, the experimenter could tell 
who had written each note as well as to whom each note was 
addressed. 
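
The reconstruction of the communication process from such notes can be sketched as a tabulation. The Python sketch below is a minimal illustration: the record layout `(time, sender, recipient)` and the function names are assumptions made for the example, not the procedure actually used in the analysis.

```python
from collections import Counter


def communication_matrix(notes):
    """Count notes for every (sender, recipient) pair.

    notes is a list of (time, sender, recipient) records, one per note.
    """
    counts = Counter()
    for _time, sender, recipient in notes:
        counts[(sender, recipient)] += 1
    return counts


def sent_and_received(notes):
    """Return, per member, the number of notes sent and the number received."""
    sent, received = Counter(), Counter()
    for _time, sender, recipient in notes:
        sent[sender] += 1
        received[recipient] += 1
    return sent, received


# Hypothetical records: the time written, who wrote, and to whom.
notes = [(1, "A", "B"), (2, "B", "A"), (3, "A", "B"), (4, "C", "A")]
matrix = communication_matrix(notes)
sent, received = sent_and_received(notes)
```

Because every note carries its time, sender, and recipient, the same records also allow the sequence of communication to be followed simply by sorting on the time entry.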

In his experiment on communication in a status hierarchy, 
Kelley (19) also restricted communication to written notes. Again 
there were a number of functions served by this restriction on the
communication process. First, the experimenter intercepted all the 
notes written and thus had a detailed record of the communication 
process. Secondly, since all communication was by written notes, 
the experimenter could easily manipulate the communication proc- 
ess. Actually, none of the notes which the subjects wrote to one 
another was delivered. The notes which they received were fictitious 
ones designed to produce certain effects. In this manner a standard 
pattern of receiving communications from others was established 
for every group in all of the experimenter’s conditions. 

Restrictions on the behavior of the group can also be produced 
by an appropriate activity in which the group must engage. An 
activity can be chosen to eliminate certain complications, restrict
the range of behavior, or produce certain reactions in the subject.

French (16), in his experiment on the effects of frustration and
fear on organized and unorganized groups, produced frustration in 
his groups by means of the activity in which they engaged. The 
groups were put to work on a task which was impossible to com- 
plete. The frustration engendered in this manner was unmistakable. 

In his experiment on the relationship between influence 
and group cohesiveness, Back (3) wanted to produce a situation 
in which two subjects, meeting together, had different interpre- 
tations of, or opinions about, the same set of facts. Before they 
came together, each subject was given a set of three pictures 
and asked to write a story about them. Each of the subjects 
was actually given different pictures, which would force different 
interpretations. The differences between the sets of pictures, how-
ever, were so slight that none of the subjects ever suspected that he 
had seen different pictures. In this manner, by appropriate choice 
of activity, Back was able to ensure that, in every group, there would
be a difference of opinion between the two subjects at the beginning 
of their discussion. 

In experiments by Bavelas (5) and his colleagues (20) on the 
effectiveness of different patterns of communication in groups, a 
technique has been employed which is perhaps the most extreme ex- 
ample of restriction in a situation. In these studies the experimenters 
were concerned with determining which of a number of patterns of 
communication among members of a group would result in more 
effective problem-solving. To produce the different patterns of com- 
munication, the experimenters allowed some members to communi- 
cate to one another and prevented others from doing so. By this 
simple restriction, on which channels of communication were or 
were not available, various communication patterns were estab-
lished. In these experiments the purposes of the investigators and 
the artificiality of the manipulation device were not hidden from 
the subjects. The restriction of the situation, however, was such 
that the subjects had to behave within it as well as they could. The 
results of these experiments show that the manipulation was suc- 
cessful. Such extreme and frank restriction of the situation would 
be appropriate, of course, only for a relatively selected range of 
problems. 
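
A restriction on channels of communication can be represented as a set of permitted pairs. The Python sketch below is an illustration only: the pattern names "circle", "chain", and "wheel", the member labels, and the function names are assumptions introduced for the example and are not taken from the description above.

```python
def make_pattern(name, members):
    """Return the set of open two-way channels for a named pattern."""
    n = len(members)
    if name == "circle":
        # Each member is linked to the next, and the last back to the first.
        pairs = [(members[i], members[(i + 1) % n]) for i in range(n)]
    elif name == "chain":
        # Each member is linked only to his immediate neighbors in a line.
        pairs = [(members[i], members[i + 1]) for i in range(n - 1)]
    elif name == "wheel":
        # One central member is linked to all others; no other links exist.
        hub = members[0]
        pairs = [(hub, m) for m in members[1:]]
    else:
        raise ValueError("unknown pattern: " + name)
    return {frozenset(p) for p in pairs}


def may_communicate(pattern, a, b):
    """True if a channel between members a and b is open in this pattern."""
    return frozenset((a, b)) in pattern


members = ["A", "B", "C", "D", "E"]
wheel = make_pattern("wheel", members)
circle = make_pattern("circle", members)
```

Stating the pattern as an explicit set of open channels makes the manipulation easy to check: any message between a pair not in the set is simply not permitted.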

In the foregoing discussion, we have by no means covered
exhaustively the various kinds of techniques for the control and the
manipulation of variables. Those described are no more than a few 
examples of the wide variety of which an experimenter can avail 
himself. Many more possible techniques are likely to be developed 
in the near future. It should again be stressed that when one em-
ploys new techniques for manipulation of variables, or even some
of those already developed, it is important to conduct preliminary 
experimentation to make sure that the manipulation is actually 
working. 


OPPORTUNITIES FOR MEASUREMENT IN 
LABORATORY EXPERIMENTATION 

Opportunities for collecting data in a laboratory experiment 
are present at all phases, from the recruiting of subjects until the 
end of the experimental sessions. There are, of course, some restric-
tions on what kinds of measurement can be employed at various
phases in this process. These depend upon the design of the experi-
ment and the way in which it is cognitively structured for the
subjects. We shall point out some of the measurement possibilities 
at each of the stages of a laboratory experiment. 

The first opportunity for measurement occurs before the experi- 
mental session takes place. Such measurement may be made at the 
time of recruiting subjects or when the subjects have assembled in 
the laboratory but before the experiment has begun. The exact 
time at which the measurement is done is immaterial and is gen- 
erally selected for convenience. Such measurements, using a ques-
tionnaire or an interview, can have the following purposes: (1) to
obtain some measure which will be compared to a similar one taken 
during or after the experiment; and (2) to enable the experimenter 
to control a variable by manipulating the composition of the 
group according to these measures. 

In some experiments, it is essential for data to be collected
before the experiment begins. Thibaut (30), in his experiment on
the cohesiveness of privileged and underprivileged subgroups, em- 
ployed pre-experimental measurements to equate groups in the 
experiment and also to have a comparison between a pre-experi-
mental and a postexperimental measure. The subjects were members
of already existing clubs. The investigator met the group at some
designated place, usually their Y.M.C.A. or their club. He provided
transportation for them to the experimental rooms. Before setting
out for the laboratory, he asked them to answer a questionnaire 
concerning who their friends were among the other boys. He then 
brought them to the experimental rooms and was able to divide 
them into two subgroups so that each person had about as many 
of his friends within his own subgroup as in the other subgroup. 
After the experiment was concluded, the boys were again asked 
to answer the same sociometric questions. In this manner the inves- 
tigator was able not only to equate his subgroups for amount of
friendship within them but also to provide a basis for determining
the effect of the experimental procedure on this variable.
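
The division of a group into two subgroups equated for friendship can be sketched as a small search. The Python sketch below is an illustration under stated assumptions: the function name `balanced_split`, the cost measure, and the exhaustive search over equal halves are choices made for the example, and nothing here is claimed to be the procedure the investigator actually followed.

```python
from itertools import combinations


def balanced_split(members, friends):
    """Split members into two halves so that each person has about as many
    friends inside his own subgroup as in the other subgroup.

    friends maps each member to the set of members he named as friends.
    Returns the best pair of subgroups and its total imbalance.
    """
    first, rest = members[0], members[1:]
    half = len(members) // 2
    best, best_cost = None, None
    # Fixing the first member in subgroup a avoids counting mirror images.
    for combo in combinations(rest, half - 1):
        a = {first, *combo}
        b = set(members) - a
        cost = 0
        for person in members:
            own, other = (a, b) if person in a else (b, a)
            inside = len(friends[person] & own)
            outside = len(friends[person] & other)
            cost += abs(inside - outside)
        if best_cost is None or cost < best_cost:
            best, best_cost = (a, b), cost
    return best, best_cost


# Hypothetical sociometric choices for a club of four boys.
members = ["A", "B", "C", "D"]
friends = {"A": {"B", "C"}, "B": {"A", "D"}, "C": {"A", "D"}, "D": {"B", "C"}}
(group_a, group_b), imbalance = balanced_split(members, friends)
```

An exhaustive search of this kind is practical only for small clubs, since the number of possible divisions grows rapidly with the size of the group.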

Most of the possibilities for measurement occur, of course,
during the actual progress of the experiment. One of the most fre- 
quently used measurement devices is observation of the group as 
it carries on its activities (dealt with in detail in Chap. 9). We shall
discuss here some of the other kinds of data collection which are 
possible during the experiment. 

The product of the activity in which the group engages is a
major source of data. This product may take any of a variety of 
forms and may be analyzed in various ways by the investigator. 

Kelley (19), in his experiment on communication in a status 
hierarchy, had his subjects arrange bricks in a certain pattern on 
the floor in accordance with instructions communicated to them. 
The actual product— that is, the exact pattern of bricks with which 
the group finished— was recorded by the experimenter and was used 
to obtain a measure of adequacy of production. 

In his experiment on competitive and cooperative groups, 
Deutsch (9) had the subjects discuss, and write solutions to, various 
human-relations problems. He then analyzed these written products 
of the group discussion to obtain measures of the adequacy of the
solution to the problem. 

Closely related to such products are various records which the 
subject makes in the process of doing the required activity. Thus, 
in the Kelley (19) experiment and in the Festinger and Thibaut 
(12) experiment on direction of communication, the actual notes 
which the subjects wrote while carrying on the discussion were the 
main source of data.

Questionnaires and interviews may also be used during the
course of the experiment. These may take the artificial form of 
questions interpolated into, and momentarily interrupting, the 
experiment or they may be disguised as election votes or expressions 
of opinion necessary to the conduct of the experiment. 

Schachter (26), in his experiment on rejection of deviates, 
created a situation which was cognitively real to the subject. The 
groups were clubs which the subject had joined and which the 
subject expected would continue meeting periodically. It was 
fitting, consequently, to ask the subjects to elect committees to carry 
on various of the club functions and to vote on when and how often
the club should meet. In this experiment, the data collection was 
seen by the subjects not as such but rather as part of their function- 
ing as members of a club. 

In the Festinger (10) experiment on mixed Catholic and Jewish 
groups, the major data were collected by holding elections for officers 
of a club. Here the situation was cognitively experimental for the 
subjects and the voting was undoubtedly seen as part of the experi- 
mental procedure. The results indicate it to have been an adequate 
method of data collection. 

One can also collect a wide variety of data by questionnaires, 
interviews, or tests at the conclusion of the experimental session. 
The techniques of such data collection are discussed in Chapters 8 
and 9. 


SUMMARY 

Laboratory experiments constitute a powerful technique for
investigating relationships among variables. The essence of such
experiments may be described as observing the effect on a dependent
variable of the manipulation of an independent variable under
controlled conditions. Such experiments, if well designed, can pro-
duce clear and unambiguous results which may add to a theoretical
body of knowledge.

It is important to remember, however, that laboratory experi-
mentation, as a technique for the development of an empirical body
of knowledge, cannot exist by itself. Experiments in the laboratory
must derive their direction from studies of real-life situations,
and results must continually be checked by studies of real-life sit-
uations. The laboratory experiment is a technique for basic and
theoretical research and is not the goal of an empirical science.

We have, in this chapter, enumerated in some detail many 
techniques for designing laboratory experiments and for manipu-
lating different kinds of variables in a variety of ways. Many of these
techniques for the manipulation of variables involve deception, 
prevarication, misdirection of subject, and the like. As long as an 
investigator works with human subjects, it is impossible to over-
emphasize the necessity for keeping in mind the responsibilities
to the subject and the ethics which the experimenter must follow. 
It is important, if such experimentation is to continue and is to be 
tolerated by the people who help in it, that the experimenter per- 
form a service to the subjects in exchange for their help. In all 
laboratory experiments it should be a firm policy to give the subjects 
a full explanation at the conclusion of each experiment. This some- 
times requires spending more time explaining and discussing mat-
ters with the group than it took to do the experiment. If it is done
well, the subjects leave feeling that they have learned something 
and have not wasted their time. The subjects do not resent having 
been misdirected and deceived if they can see the reasons for the 
deceptions and understand the purposes.


PART II


Procedures for 
Sampling 


Before we proceed from research settings to the collection 
of data, it is necessary to pause for a consideration of sampling:
how it is done and what its implications are. The question of sam-
pling may be simply stated: how is the investigator going to decide
what persons or groups or organizations or communities to use for 
the collection of his data? The way this decision is made will affect 
the conclusions which may be drawn and the precision of these 
conclusions. 

Many investigators may protest that they do not make these 
decisions— that these decisions are made for them. One does research 
in industries into which one can get entree, one uses as subjects in 
laboratory experiments those people who volunteer, and the like. 
But such situations, frequent though they may be, do not obviate
the necessity for a consideration of the nature of the sample and its
characteristics. 

Sampling theory has made enormous strides in recent years, 
mainly in connection with the problems arising from large-scale 
survey operations. It is, consequently, easiest to talk about sampling
in connection with the problems of surveys, and it is easiest to see
the applications in that context. But the applications exist else- 
where, too, and they must be discovered and used. As a single 
example, the reader may notice that the discussion of cluster sam- 
pling in the following chapter is quite relevant to laboratory group 
experiments where perhaps twenty groups of six persons each are
the subjects of data collection.

There is no hard and fast rule for deciding just where the
specialized work of “sampling” begins and ends; the term has been 
used with various meanings in different contexts. It seems convenient 
to exclude the processes and problems of making observations (or 
measurements) from the area of this chapter. But it must be emphasized
that this exclusion is arbitrary— that the problems of errors of response 
and nonresponse have bearing on the sample design (see Section 16). 

Assume, then, that a method of observation has been decided on 
whereby the value of some characteristic may be obtained for any of 
the N elements (members, individuals, cases) which comprise the 
population (or universe). A target of empirical research is usually
some numerical expression which summarizes the information about
the characteristic from all the N elements of the entire population,
i.e., a parameter, a population value. An example of such a parameter
is the mean: $\bar{X} = \frac{1}{N} \sum x$.* Moreover, the mean is an important and
commonly known and used parameter. Therefore, and for convenience
and brevity, much of the discussion to follow will center around it.
However, many of the general remarks concerning the mean will apply 
also to other parameters. 

To obtain the exact value of a parameter, observations have to be
made on all the elements in the entire population. However, seldom
can the information be obtained about all members of a population large
enough to be interesting. Usually practical considerations, particularly
cost, force us to be satisfied with making inferences about population
values from data which are based on a sample only. What are the tasks
facing the researcher in planning a sample in order to make inferences
about the population values? He must (1) select a sample, (2) make
observations, and (3) compute the sample values; that is, estimates based on
the sample data.

*The number of elements in the population is denoted here by N. The symbol
x stands for the values of the variable characteristic. Then there are N separate
values denoted by the general term x. The symbol $\sum x$ stands for the sum of the N
values of x; that is, for $x_1 + x_2 + \cdots + x_N$.

These sample values are of little interest in themselves. They are 
worth obtaining and are of interest to us only in so far as they yield 
information about the corresponding values in some population. Thus, 
the selection of the sample and the process of estimation are two tasks
which are incurred jointly because the population values are estimated
from a sample. Let us call the joint procedures of selection and estima- 
tion the sample design. 

This chapter will concentrate on the less mathematical aspects of
different procedures of selection. The important but somewhat tech-
nical problems of the use of different kinds of estimation will be largely
neglected except for a brief discussion in Section 22. In the illustrative
material the method of observation is assumed to be the interview.
However, the procedures have general applicability to a great variety
of other procedures, situations, and problems.


FUNDAMENTALS OF SAMPLING 


1. A Simple Random Sample

Before proceeding further, let us look at an illustration of a specific
sample design. Suppose that it is desired to learn something about the
attitudes of the 12,000 employees of a factory. Suppose, also, that it
has been decided that the method of sampling is to be “simple random
sampling” and that the size of the sample will be 400 interviews.†

How do we select a “simple random sample” of 400 out of 12,000
employees? We obtain a (payroll) list numbered from 1 to 12,000 on
which each of the 12,000 employees appears once. From a table of
random numbers, 400 different‡ numbers are drawn (4, p. 34). At each
draw a five-digit number not greater than 12,000 is taken. These 400
numbers will designate the numbers of 400 employees. Their names
and other necessary identification are obtained and given to inter-
viewers. They are interviewed, and their answers are reduced to a
numerical code.

†At this point, some may think of the widely used sampling procedure whereby
every ith (here every 30th) element is selected into the sample. That distinct
method of sampling, involving the use of intervals of selection, we call “systematic
sampling” and discuss in Section 10.
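The selection just described can be imitated on a computer. What follows is only a minimal sketch: it assumes that Python's standard pseudo-random generator may stand in for the printed table of random numbers, and the function name and the seed are illustrative choices, not part of the procedure described above.

```python
import random

def select_simple_random_sample(population_size, sample_size, seed=None):
    """Draw `sample_size` different list numbers from 1..population_size.

    Every number on the list has the same chance of selection, and no
    number can appear twice (sampling without replacement).
    """
    rng = random.Random(seed)  # the seed only makes the sketch reproducible
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

# 400 payroll numbers out of the 12,000 on the list
selected = select_simple_random_sample(12_000, 400, seed=1)
```

The numbers in `selected` would then be matched against the payroll list to obtain the names given to the interviewers.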

Now we want to estimate the proportion of all the employees in
the factory who hold a certain attitude; that is, we shall calculate a
sample value to serve as an estimate of the population value. For
simple random sampling, the sample mean is obtained by the simple
and familiar procedure of adding up all the values of the sample cases
and dividing by their number n. In symbols:§

$\bar{x}'_0 = \frac{1}{n} \sum x$

A proportion p is only a special kind of mean ($\bar{X}$), where all the Np
elements which possess a specified attribute are endowed with the value
x = 1 and all other N(1 − p) elements are assigned the value
x = 0. The sample mean is obtained by dividing the number in the
sample (r) possessing the attribute by the total number in the sample:
p' = r/n. For example, let us say that in the 400 interviews there
were 80 “yes” answers to a question. The sample proportion p' = 80/400
= .20 is our estimate of p, the proportion that would have been
obtained had the entire population been interviewed. The estimate of
the aggregate, the number of employees in the factory who would have
said “yes,” is Np' = 12,000 × .20 = 2400.

The sample mean obtained from any one sample is only one of the
many possible values that could have been obtained; it is subject to
sampling variability. We want to estimate that variability. To
be able to do this, we must have a measure of the variability to which

‡By the word different we mean that if a number appears a second
time we disregard it. Thus, at each choice we are selecting from among
the remaining elements.

§The subscript 0 is used to denote that the sample design was simple
random sampling. Similarly, estimates from other designs will also be
designated by specific subscripts.

est. s.e. of $\bar{x}'_0$ = $.983 \sqrt{\frac{p'(1 - p')}{n - 1}}$ = .983 × .0200 = .0197.

The estimated standard error of the aggregate $N\bar{x}'$ can be given as

est. s.e. of ($N\bar{x}'$) = N (est. s.e. of $\bar{x}'$).

Hence, the standard error of the estimated 2400 employees who would
have said “yes” to the question is 12,000 × .0197 = 236 employees.

The simple random sample as described here is seldom used in
practice. It occupies a central place in sampling theory because it
serves as a standard of comparison and because it is the basis for the
various modifications of more complicated designs superimposed on it.
In addition, its treatment here is justified by the fact that most of the
formulas used in introductory texts, with which the reader is assumed
to be acquainted, refer to samples obtained by simple random sampling.
The reader is warned that those formulas cannot be used validly in
connection with samples obtained from other kinds of sample designs.
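The arithmetic of the illustration (80 “yes” answers among 400 interviews drawn from 12,000 employees) can be retraced in a few lines. This is a sketch only; the variable names are ours, and the formulas are those given in the text for a proportion under simple random sampling.

```python
import math

N = 12_000      # employees on the payroll list
n = 400         # interviews
yes = 80        # "yes" answers in the sample

p = yes / n                                  # sample proportion p'
f = n / N                                    # sampling fraction
est_var = (1 - f) * p * (1 - p) / (n - 1)    # estimated variance of p'
est_se = math.sqrt(est_var)                  # estimated standard error of p'

aggregate = N * p                            # estimated number of "yes" employees
aggregate_se = N * est_se                    # its estimated standard error

print(f"p' = {p:.2f}, est. s.e. = {est_se:.4f}")
print(f"aggregate = {aggregate:.0f}, est. s.e. = {aggregate_se:.0f}")
```

Rounded as in the text, the standard error of the proportion is .0197 and that of the aggregate is 236 employees.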

2. The Sampling Distribution of the Estimate

The information that the mean of a specific sample is .20 has no
intrinsic practical worth; its value lies in that it tells us something
about the population mean $\bar{X}$. Once we know that a sample mean* $\bar{x}'$
is .20, just what can we say about $\bar{X}$? We know that the unknown $\bar{X}$ will
differ from our known $\bar{x}'$ = .20, but we do not know by how much.
The sample mean will depend on which sample of 400 persons hap-
pened to be selected, since different samples of 400 persons would give
different sample means. The deviation from the population mean of
any single sample mean is unknown; it may be large or small, plus
or minus. The only fruitful way of looking at sampling variability is in
terms of what size deviations are likely to occur in the long run. That
is, we ask: given a sample design, what values of the sample mean $\bar{x}'$

*The discussion in this section concerns sample means derived from any kind
of sample design. Here $\bar{x}'$ without any subscript denotes a sample mean in general,
without specifying the sample design. The symbol $\bar{x}'_c$ is used to denote any one of
the values in the distribution of all sample means which may be obtained with the
specific design (Section 16). The “sample mean” here denotes the sample estimate
of the population mean. It does not necessarily refer to the simple mean of the
sample cases; in Sections 8, 11, and 22 other formulas are given for the means of
some other designs. The discussion is relevant for the sampling distribution of
statistics other than the mean.

$\bar{x}'_0$ is subject (Sections 2 and 3). The standard error, the square root
of the variance, of the sample mean is the usual measure of that
variability. For simple random sampling, the variance of the sample
mean can be estimated from the sample as:

est. var. of $\bar{x}'_0 = (1 - f)\frac{s^2}{n}$, where $s^2 = \frac{1}{n - 1}\sum (x - \bar{x}'_0)^2$;

that is, the variance of the sample cases, with (n − 1) used as divisor.
(The statistic $s^2$ is a sample estimate of the variance of the elements in
the population.)

This formula holds also, of course, in the event that our mean is
a proportion. However, in that event there is a form which saves the
labor of squaring:

est. var. of $p' = (1 - f)\frac{p'(1 - p')}{n - 1}$.

It may be noted that here $\frac{p'(1 - p')}{n - 1}$ takes the place of its
equivalent $\frac{s^2}{n}$.

These formulas will appear familiar to the reader with the possible
exception of the factor (1 − f), which for a simple random sample is
(1 − f) = (1 − n/N). This “correction for a finite population” arises
when sampling from a finite population “without replacement.” This
last phrase refers to the procedure which prevents any element from
being selected into the sample more than once. (We did this by
selecting 400 different random numbers, not allowing the same number
to appear twice.) In most practical cases the sampling fraction (n/N)
is so small that this correction is of no importance. In reading the
formulas, the reader should slip past it and on to the important parts
of the formulas.
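The general formula and the short form for a proportion can be compared numerically. The sketch below is illustrative only: the function name is ours, and the data are simply the 80 “yes” and 320 “no” answers of the running example coded as 1 and 0.

```python
import math

def est_var_of_mean(values, population_size):
    """Estimated variance of the mean of a simple random sample drawn
    without replacement: (1 - f) * s**2 / n."""
    n = len(values)
    mean = sum(values) / n
    s2 = sum((x - mean) ** 2 for x in values) / (n - 1)  # divisor n - 1
    f = n / population_size                              # sampling fraction
    return (1 - f) * s2 / n

# 80 "yes" (1) and 320 "no" (0) answers, as in the illustration
answers = [1] * 80 + [0] * 320
general = est_var_of_mean(answers, 12_000)

# Short form for a proportion, which avoids squaring the deviations
p = sum(answers) / len(answers)
short = (1 - 400 / 12_000) * p * (1 - p) / (400 - 1)
```

For values coded 0 and 1 the two results agree up to rounding error, which is why the short form may be used whenever the mean is a proportion.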

In the present case the sampling fraction is n/N = 400/12,000 =
1/30. Then (1 − f) = 1 − 1/30 = 29/30 = .967. On the standard
error, the effect is still less, being equal to $\sqrt{1 - 1/30}$, which is closely
equal to 1 − 1/2 · 1/30 = .983. In most cases this factor is so close
to unity that multiplying by it has no appreciable effect. In our
illustration we have for p' = .20:

est. var. of $\bar{x}'_0$ = $.967 \times \frac{(.20)(.80)}{399}$ = .967 × .000401 = .000388.
Its square root is .0197. Or, again,

The deviation of each possible sample mean is taken from the mean of
the sampling distribution $\bar{X}'$. These deviations may be denoted as
($\bar{x}'_c - \bar{X}'$). Then the variance of $\bar{x}'$ around $\bar{X}'$ is calculated by taking
the sum of the squared deviations, each multiplied by its probability
(its “relative frequency”) of occurrence:

$\mathrm{var}(\bar{x}') = \sum P_c (\bar{x}'_c - \bar{X}')^2$

(The summation is for all possible samples.)

The standard error of the sample mean $\bar{x}'$ is defined as the square
root of this variance. In other words, the sampling distribution repre-
sents the random fluctuation, the variability of the sample mean, due
to a specified sample design. The amount of variability is measured in
terms of the standard deviation of the sampling distribution. This very
important quantity is called the standard error of the sample mean,
and we shall abbreviate it as s.e. of $\bar{x}'$. Remember that it is always
defined as the standard deviation of the distribution of all sample
means under a specified sample design.

Now, as we have said above, it is not possible in practice to
obtain the standard error directly from the actual sampling dis-
tribution. However, through the use of the concept of the sampling
distribution, mathematical statisticians have developed
formulas for the variances and standard errors of sample means for
many practical sample designs. Moreover, they have developed prac-
tical formulas for obtaining estimates of the variances and standard
errors of sample means: estimates which can be computed from the
data of a single sample. One example is the formula given for a simple
random sample:

est. s.e. of $\bar{x}'_0 = \sqrt{1 - f}\,\frac{s}{\sqrt{n}}$

The basic definition of the standard error of the sample mean holds for
every sample design. But it must be emphasized that there are different
formulas for the estimated standard error; some of these will be given
later for other designs. They serve as powerful tools in the formulation
of confidence intervals.
are possible and what is the probability of occurrence of each of those
values? The array of possible values of $\bar{x}'$, each with its probability of
occurrence, is a distribution; it is the sampling distribution of the
possible sample means $\bar{x}'$.†

In the course of empirical research, when we calculate the mean
from the single available sample, we cannot obtain the actual values
for all the sample means possible under the design. Nevertheless, the
distribution of all possible sample means is an important theoretical
concept. It is what we should have in mind when we think about
sampling fluctuations, sampling errors, standard errors, and such.
Hence, suppose that there has been specified a sampling design to be
applied to a given population, including the size of the sample, the
method of selection, and the method of estimation. Now imagine that
all the sample means ($\bar{x}'_c$) possible under the design are calculated;
and so is the probability ($P_c$) of occurrence of each of those sample
means.‡ This probability is analogous to the “relative frequency” in
the calculations of the common formulas for variance. Now we have
the sampling distribution of the different values that the sample mean
$\bar{x}'_c$ may take. The mean of that distribution, i.e., the mean of the pos-
sible sample means, is denoted by $\bar{X}'$, and in well-designed samples
$\bar{X}'$ is either equal to the parameter $\bar{X}$ or is close to it. The serious
problems of the differences between $\bar{X}'$ and $\bar{X}$ which arise in practice
are discussed briefly in Section 16. Until then, we shall ignore the
differences between the parameter $\bar{X}$ and the mean of the sampling
distribution $\bar{X}'$.*

†See (18, pp. 2-4; 4, pp. 37-43; and 16, pp. 95-103).

‡Imagine that, after the sample design was specified, sample after sample was
drawn. The mean for each sample was calculated and these values were tabulated.
After many samples were drawn, the form of this distribution would gradually
become more stable. As the number of samples drawn increased, the shape of the
distribution would approach the true sampling distribution. In Section 3 we men-
tion that for most large samples that distribution is close to a normal distribution.

*These differences exist, however, and are of great practical importance,
owing to the presence of the nonsampling errors of response and nonresponse.
Hence, in this chapter the statements of statistical inference are made not to the
parameters but instead to “population values.” These are the values that would
have been obtained if the entire population had been designated for observation
rather than only a sample (Section 16). They are subject to the same sources of
errors of response and nonresponse to which the sample estimates $\bar{x}'_c$ are subject.
They may be thought of as roughly equal to the mean $\bar{X}'$ of the distribution of the
sample estimates.
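For a very small population the sampling distribution need not be imagined; it can be written out in full. The following sketch does so under stated assumptions: the five population values are hypothetical, the design is a simple random sample of two different elements, and the closing comparison uses the standard textbook result that the variance of the mean under this design is (1 − n/N) S²/n, with S² the population variance computed with divisor N − 1.

```python
from itertools import combinations
from fractions import Fraction

population = [2, 4, 6, 8, 10]   # hypothetical values of x for N = 5 elements
N, n = len(population), 2

# Every possible sample of n different elements is equally likely.
samples = list(combinations(population, n))
P_c = Fraction(1, len(samples))

# Sampling distribution: each possible sample mean with its probability.
distribution = {}
for sample in samples:
    mean_c = Fraction(sum(sample), n)
    distribution[mean_c] = distribution.get(mean_c, 0) + P_c

# Mean and variance of the sampling distribution.
mean_of_means = sum(m * p for m, p in distribution.items())
variance = sum(p * (m - mean_of_means) ** 2 for m, p in distribution.items())

# Population mean, and the variance formula for simple random sampling.
pop_mean = Fraction(sum(population), N)
S2 = sum((x - pop_mean) ** 2 for x in population) / (N - 1)
formula = (1 - Fraction(n, N)) * S2 / n
```

Here the mean of the ten possible sample means equals the population mean, and the variance obtained by direct enumeration agrees with the formula. Exact fractions are used so that the agreement is not blurred by rounding.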
(c) In the illustration in Section 1 we calculated the sample mean
as .20 and its estimated standard error as .02. Now we make the 
statement that the population value lies in the interval between the 
value of .20 — 2 (.02) and the value of .20 + 2 (.02) ; that is, between 
16 percent and 24 percent. This statement is the result of the sample 
we happened to draw; on the basis of another selection, we might have 
said between 19 and 25 percent, or between 14 and 24 percent, etc. 
The probability that statements of this kind are correct is 0.95. That is,
in the long run, about 19 out of 20 statements of this kind will be 
correct. Thus, there are 5 chances in 100 that the statement would 
turn out to be incorrect; that, if we obtained the population value, we 
would find that it lies outside the limits we set (4, pp. 73-74). 
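The long-run meaning of such statements can be checked by repeated sampling from a known population. The sketch below is a simulation under assumptions of our own: a hypothetical factory in which exactly 2,400 of the 12,000 employees would answer “yes,” 2,000 repeated simple random samples of 400, and a pseudo-random generator in place of a table of random numbers.

```python
import math
import random

def coverage_of_intervals(N=12_000, yes_in_population=2_400,
                          n=400, repetitions=2_000, seed=0):
    """Fraction of repeated samples whose interval p' +/- 2 (est. s.e.)
    contains the population proportion."""
    rng = random.Random(seed)
    population = [1] * yes_in_population + [0] * (N - yes_in_population)
    true_p = yes_in_population / N
    hits = 0
    for _ in range(repetitions):
        sample = rng.sample(population, n)      # without replacement
        p = sum(sample) / n
        se = math.sqrt((1 - n / N) * p * (1 - p) / (n - 1))
        if p - 2 * se <= true_p <= p + 2 * se:
            hits += 1
    return hits / repetitions

coverage = coverage_of_intervals()
```

The value of `coverage` should come out near 0.95, though not exactly so, since it is itself based on a finite number of repetitions.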

In general, we make the statement that the population mean $\bar{X}$ is
somewhere between the values of $\bar{x}' \pm 2$ (est. s.e. of $\bar{x}'$), and that state-
ment will have a 95 percent probability of being correct. (This assertion
assumes that a valid estimate of the standard error was obtained, and
that the approximation to normality of the sampling distribution is
good enough.) We may choose our confidence intervals as $\bar{x}' \pm t$ (est.
s.e. of $\bar{x}'$); the larger t is, the greater the probability that the statement
is correct. Here t is the “normal deviate,” which may be found in
tables in many statistics textbooks.

Some values of t and the corresponding values of the probability P of
making correct statements are:

t    .67    1.00    1.96    2.58    3.00    4.00

P    .50     .68     .95     .99    .997    .99994

The probability that we tell the truth increases rapidly with an
increase in the value of t; that is, with an increase in the length of the
confidence interval. However, the longer the confidence interval, the
less useful it is. It is general practice to fix the level of probability at
some point and then use the corresponding t to get the length of the
interval. In social science frequently the 95-percent level of probability
is used, corresponding to a value of t = 1.96. In this chapter, the value
of 2 is used as an approximation to 1.96.
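The tabulated probabilities need not be taken from a printed table. A brief sketch, assuming Python 3.8 or later so that `statistics.NormalDist` is available; the two helper names are ours.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def probability_within(t):
    """Probability that a normal deviate lies within +/- t."""
    return standard_normal.cdf(t) - standard_normal.cdf(-t)

def deviate_for(probability):
    """Value of t whose central interval has the given probability."""
    return standard_normal.inv_cdf(0.5 + probability / 2)

table = {t: probability_within(t) for t in (0.67, 1.00, 1.96, 2.58, 3.00, 4.00)}
```

Calling `probability_within(2)` shows how little is changed by using 2 in place of 1.96: the probability rises only from about .950 to about .954.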

Our aim is to reduce the length of the confidence interval without 
decreasing the probability of making truthful statements. The reduc- 
tion of the standard error of sample results is the goal of sample design, 
discussed in Sections 7 and 14.

3. Confidence Intervals

With regard to the use of confidence intervals for sample estimates,
some understanding of several important points is necessary.

(a) From the results of well-planned probability samples, it is
possible to compute values for the estimated standard errors of sample
means. Mathematical statisticians have derived formulas from which
we can compute those values or some useful practical approximations.
The standard error is defined as the standard deviation of the distribu-
tion of the sample means, and the distribution depends on the specific
sample design used. Therefore, the formula for the estimated
standard error will depend on the sample design used. In the case of
simple random sampling, we have the familiar forms of $s/\sqrt{n}$ or
$\sqrt{p'(1 - p')/(n - 1)}$. But these formulas will not hold for other
sample designs. The difference may be either in the selection or in the
estimation procedures.

(b) Many of the estimates used in practice have sampling distribu-
tions which are approximately normally distributed. Just how good
that approximation is depends on the underlying distribution of the
characteristic in the population and on the size and design of the
sample. For any variable encountered in practice, the approximation
improves with the size of the sample.

In case one has a small sample, or some other reason for doubts,
he should ascertain whether he may proceed with the assumption of
normality. If the sampling distribution departs seriously from the
normal distribution, two kinds of alternatives are open for the construc-
tion of confidence intervals. A search may be made for some distribu-
tion other than the normal, to serve as a useful approximation. Or he
may try to make use of a “distribution-free” statistic (see Chap. 12).

For many sample results encountered in practical social research
work, the assumption of normality will lead to errors which are small
compared to other sources of inaccuracy which are tolerated. The
assumption of normality leads to simple statements of probability
through the use of confidence intervals. If, as below, we use tables of
the normal distribution for making those probability statements, we
thereby assume that the normal distribution is a good approximation
to the sampling distribution of our estimates. In so far as that assump-
tion is not justified, our statements will have a probability of being
wrong different from that which we intended and stated.

congruent with the probability model that underlies our statistical
theory. Terms such as “tossing perfect coins” or “drawing perfect balls
from perfect urns after complete mixing” (terms found in textbooks)
have the aim of providing an intuitive grasp of the kind of necessary
physical process. 

However, there are practical reasons why we must make two 
modifications in this simple picture. First, there are serious objections 
to tossing, mixing, and “drawing” the kind of elements with which we 
want to deal— the elements themselves might object. This problem is 
met by listing and numbering the elements, and then mixing and 
drawing uniform objects bearing their identification numbers. That is 
what is supposed to go on in a bingo cage, or in a lottery. Secondly, it is 
difficult to construct perfect coins, perfect balls, or perfect urns, or to 
bring about complete mixing. The equivalent of that process has been
performed by careful experts who gave us their results in “tables of
random numbers”— our convenient equivalent of complete mixing of
perfect balls. Thus, we see that the mechanical process of selection,
which is indispensable for probability sampling, is accomplished by the
following chain: A set of numbers selected properly from a table of
random numbers identifies a set of numbers on a listing of individual
sampling units; from these selected units the identification is made to a 
set of physical units which will comprise the sample. 
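The chain from random numbers to list numbers to physical units may be sketched as follows. Everything here is illustrative: a pseudo-random digit stream stands in for the printed table, the payroll listing and its identification strings are hypothetical, and the function names are ours.

```python
import random

def draw_from_table(digit_stream, population_size, sample_size, width=5):
    """Read `width`-digit numbers from a stream of random digits, keeping
    those between 1 and population_size that have not appeared before."""
    chosen = []
    seen = set()
    while len(chosen) < sample_size:
        number = int("".join(str(next(digit_stream)) for _ in range(width)))
        if 1 <= number <= population_size and number not in seen:
            seen.add(number)
            chosen.append(number)
    return chosen

def random_digits(seed):
    """Stand-in for a printed table: an endless stream of digits 0-9."""
    rng = random.Random(seed)
    while True:
        yield rng.randrange(10)

# Hypothetical payroll listing: list number -> employee identification
listing = {number: f"employee-{number:05d}" for number in range(1, 12_001)}

numbers = draw_from_table(random_digits(seed=7), 12_000, 400)
sample = [listing[number] for number in numbers]
```

Numbers greater than 12,000, the number 00000, and repeated numbers are simply passed over, as one does when reading from a printed table.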

Sometimes there are practical problems involved in the identifica-
tion of the individuals associated with the selected numbers on the
listing. These are problems of field procedure which must be solved
with clear, simple, and practical instructions, and not merely assumed.
Sometimes, as in the case of identifying employees from a payroll list,
the task may be simple and easy. At other times, as in identifying
dwelling units from a block listing sheet, the process may call for skill,
and mistakes may occur (see Section 15).

Furthermore, there are some criteria which a list must fulfill, and
the making of a satisfactory list involves various problems. First, a list
may be incomplete; for example, the list of payroll cards of the factory
may exclude white-collar employees or the employees hired since some
recent date. It may be decided to exclude these explicitly from the
population. Alternately, one may establish a separate stratum for them
so that they will obtain the proper probability of selection through
separate sampling procedures.



4. Measurability: The Need for Probability Samples

The use of confidence intervals is based on statistical theory. The
application of this theory can be developed only for those samples in
which the probability of selection of every element of the population is
known. There is a gulf between the known sample result and the
unknown population value; the confidence interval is the only objective
statistical bridge across that gulf. Confidence intervals are based on
the proper estimates of the standard error. But standard errors may be
calculated validly only for probability samples; that is, for samples for
which the probability of selection of every element in the population
is known.* Therefore, if we wish to make use of the theory of statistical
inference, we must use a probability sample.

The property of a sample which enables the researcher to make
estimates based on sample data of population values and then to
calculate confidence intervals for those estimates has been called
measurability. It is desirable to consult a sampling statistician with the
plans before the survey to see whether the design will allow the valid
calculation of the precision of the sample. There may be other ways of
judging the adequacy of samples, but they depend on personal judg-
ment. If a nonprobability sample is taken, such as a “quota sample” or a
“typical” city, the results may be good or they may be poor. But
statistical theory is lacking for determining the accuracy of the results.
There may be occasions when for a small informal sample one can
afford to dispense both with precision and with its measure. However,
this chapter will be confined to the problems of probability sampling.

5. Mechanical Selection and the Use of Listings

How does one select a probability sample? Whenever a unit is
selected into the sample from among other units, the selection must
be made by some mechanical procedure which guarantees the desired
probabilities of selection to all the units involved. We need some
physical operation, some practical procedure, which will be reasonably
*Recently the term random sampling has been used by some authors syn-
onymously with probability sampling. In this chapter, when the phrase random choice
is used, we shall mean a process of selection with equal probability among the
defined group of sampling units.

in the various phases of area sampling, presented in Sections 19, 20,
and 21. Of particular influence is the size of the sampling unit used in
the listing which results in clustering, subsampling, and multistage
sampling (3, pp. 76-87; 22, pp. 60-80).

Area sampling is an important kind of listing procedure because
it is used widely in social studies. It is also used in other types of surveys;
for example, crops or other flora, as well as grocery stores, have been
sampled with the use of area segments. Its widespread use in social
surveys is due chiefly to the relative ease of identifying each member
of a human population with one, and only one, dwelling unit. In turn,
these dwelling units are identified with area segments, also uniquely.
Thus, a selection of the area segments yields a sample of dwellings, and
these in turn a sample of people. It may be expected that the unique
identifications of people with dwellings, and of these with segments, are
troublesome and imperfect. This becomes a practical matter of doing a
good job within available resources. In this connection one may mention
the necessity for boundaries which are clear, unambiguous, and
easily identified in the field.


7. Precision; Variations in Sample Design 

It is our general goal to obtain as small a confidence interval as
we can for some fixed level of probability of making correct statements.
The smaller its confidence interval, the more useful is the sample
estimate. For a fixed probability level, the length of the confidence
interval depends on the standard error. For this reason the word
precision is often used to denote the inverse of the standard error (or
sometimes of its square, the variance) of the sample estimate.

The standard error of a simple random sample is $s/\sqrt{n}$. Within
the limits of that design, the way to increase the precision (to reduce the
standard error) is to increase n, the number of elements in the sample.
The standard errors of other sample designs are different; but they all
have the common property that to increase the precision we must take
more of something (persons, dwellings, blocks, counties, etc.). But to
take more of anything costs money, and generally there is a limit to
what the sample may cost. The question may be asked, then: For a
given expenditure, how can we get the greatest precision? (See also
Section 14.)
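
The relation between the number of elements and the standard error can be illustrated numerically. The sketch below is only an illustration: the element standard deviation s = 0.4 is a hypothetical figure, not one from this chapter, the function name `standard_error_srs` is ours, and any correction for sampling a large fraction of a finite population is ignored.

```python
import math

def standard_error_srs(s, n):
    """Standard error of the mean of a simple random sample: s / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive number of elements")
    return s / math.sqrt(n)

s = 0.4  # hypothetical element standard deviation, not from the chapter
for n in (100, 400, 1600):
    se = standard_error_srs(s, n)
    # Interval half-width at two standard errors (about the 0.95 level).
    print(f"n = {n:4d}   s.e. = {se:.3f}   2 s.e. = {2 * se:.3f}")
```

Under these assumptions the sample must be made four times as large to cut the standard error in half, which is why the cost of added precision rises so quickly.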

Secondly, some elements may appear more than once on the list.
Perhaps employees who work in more than one department will have
more than one payroll card. If these individuals are to have the same
probability of selection as the others, then one should eliminate the
duplications through the entire list. If this would be too difficult, some
other adjustment has to be made in our procedures.

Thirdly, a list may contain other items in addition to members of
the population. Some lines may be simply blank, belonging to no
units, and others may belong to units of some other and excluded
population. Let us say that our list identifies the 12,000 employees of the
factory by their actual payroll numbers. These are not consecutive,
but run irregularly up to 99,999. Thus, among the five-digit numbers,
there are 88,000 which do not belong to any of the 12,000 members of
our population. Some may be blank and some may belong to office
employees, whom we want to exclude from the sample. We simply
draw five-digit numbers from a table of random numbers. We inspect
the list for each drawn number: if it belongs to a member of our
population, we have a selection; otherwise, we have not. We continue
this process until the designated size of the sample (n) is reached. This
gives us a simple random sample of n elements out of the designated
population. In general, the list should be inspected carefully before
drawing to take whatever measures are necessary to ensure
the proper probabilities of selection.
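
The drawing procedure just described may be sketched as follows. This is an illustration under stated assumptions rather than a prescribed routine: a pseudo-random generator stands in for the table of random numbers, the list (`frame`) is a hypothetical one built for the example, and a number drawn a second time is simply passed over, which we take to be in keeping with selection among the still unselected elements.

```python
import random

def draw_sample(frame, n, seed=None):
    """Draw n distinct eligible payroll numbers by rejecting unusable draws.

    frame maps each payroll number on the list to True if it belongs to
    the population and False if it belongs to an excluded group; numbers
    absent from frame correspond to blank lines.
    """
    eligible = sum(1 for in_population in frame.values() if in_population)
    if n > eligible:
        raise ValueError("n exceeds the number of eligible elements")
    rng = random.Random(seed)
    selected = set()
    while len(selected) < n:
        number = rng.randrange(100_000)   # a five-digit draw, 00000-99999
        if frame.get(number, False):      # blank or excluded numbers are skipped
            selected.add(number)          # a repeated number adds nothing
    return sorted(selected)

# Hypothetical list: 12,000 factory employees plus 500 office employees
# scattered irregularly among the five-digit payroll numbers.
builder = random.Random(1)
numbers = builder.sample(range(1, 100_000), 12_500)
frame = {num: True for num in numbers[:12_000]}
frame.update({num: False for num in numbers[12_000:]})

sample = draw_sample(frame, n=400, seed=2)
print(len(sample), sample[:5])
```

Because most five-digit numbers belong to nobody on the list, many draws are discarded; that is the price of not renumbering the list before drawing.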

6. Listings and Area Sampling

The nature of the lists available for the selection process is an
important consideration in the design of the sample. Factors which are
relevant include the nature of the listed sampling units, the extent of
coverage, the accuracy and completeness of the list, and the amount of
auxiliary information on the list. This last factor is useful, as we shall
see, for stratification, for measures of size, and in the estimation process.
Those factors help to determine the nature of the sampling design and
of the details of the practical selection procedures.

In many sampling situations there does not exist a simple and
complete listing of all individuals, such as the factory payroll discussed
in Section 5. Moreover, it may be too costly to construct one. The
important practical complications which a list may possess are present




Attached to many formulas in textbooks is an important but often
overlooked phrase which reads something like this: “[...] independent
selections made with equal probability. . . .” [...] selection is seldom
given to the researcher. Furthermore, in social research such a
selection is seldom [...].

Therefore, the automatic use of those formulas which [assume simple]
random sampling is not justified. Sometimes their [use leads]
to the construction of confidence intervals which are incorrect.
The result in those cases is that the estimate will lead to [...]

[...] denote each sample design. [...] denotes the mean of a [...]
from a simple random sample (Section [...]), [...] a
proportionate stratified sample (Section [...]).

One word more of caution may be appropriate here. What may
appear to be a slight change [...] the variance of the sample
estimate. [...] some change in procedure may introduce a serious bias
into the design.


STRATIFICATION TECHNIQUES 


8. Stratification

Stratified sampling is the [separation of the population]
into subpopulations, called strata, [with a sample selected from]
each. Every sampling unit [is] placed in one (and only one) of the strata
prior to the [selection of] the sample, so that the sum of the strata is
identical with the [population]. [From each] stratum, a sample is selected
from the units in the [stratum]; from each of those samples the estimate
is calculated [separately]. Finally, the separate estimates for each
stratum are com[bined into an] estimate of the total population value.
(However, [a simpler] calculation is available for the means of
proportionate stratified samples, given in Section 9.)

The variations in sample design are attempts to answer that
question. The search is for that sample design which is expected to
provide the smallest standard error for a given expenditure. There is
no design which is best under all circumstances. The best design
depends on the physical distribution of the population, on the
resources of the researcher, on the characteristics to be studied, and on
the research objectives.

The various sample designs may be regarded as modifications of
the basic simple random sample. The sampling fraction (n/N) denotes
the probability of selection of every element in the population. In
simple random sampling, it is achieved by making n different selections
from among the N elements in the population. Each of the n selections
is made separately and with equal probability among all the still
unselected elements. There are four kinds of modifications of this simple
selection procedure which are used in practical survey procedures:

(a) Stratification involves the classification of the population of N
elements into subpopulations called strata; and the selection is made
separately in each stratum (see Sections 8-11, 18, and 21).

(b) Systematic sampling involves the selection of a sequence of units,
separated on the listing by the length of an interval, with the choice of
one random number (see Section 10).

(c) Clustering involves the use of groups of elements, called clusters,
as sampling units. The clusters are usually some existing grouping of the
population. With each selection a group of elements is selected jointly
into the sample. With the subsampling of clusters one gets into the
interesting area of multistage sampling. By all odds clustering is the
most important kind of modification in terms of the magnitude of its
effects on practical social research (see Sections 12-14 and 17-21).

(d) Varying probabilities may be given to different sampling units.
The use of different probabilities for units of different strata is discussed
in Section 11. The use of selection with “probabilities proportional to
measures of size,” discussed in Section 20, is an important practical
procedure.

It should be noted that these modifications are not mutually
exclusive. Combinations of them can be, and are, widely used (Section
21). Furthermore, to this variety of possible selection procedures can be
added a variety of estimation procedures to produce a still richer
variety of possible sample designs (Section 22).

Now we calculate:

$$\bar{x}'_w = .6 \times .20 + .3 \times .35 + .1 \times .45 = .27.$$

That is, 27 percent is the combined sample estimate for the proportion
who would have answered “yes” to that question if all the employees
of the company had been questioned.

The estimated variance of that sample result is

$$\text{est. var.}(\bar{x}'_w) = (.6)^2 (.020)^2 + (.3)^2 (.020)^2 + (.1)^2 (.040)^2 = .000196.$$

That is, the estimated standard error is

$$\text{est. s.e.}(\bar{x}'_w) = \sqrt{.000196} = .014.$$


Therefore, with the use of two standard errors the [statement can be
made] that the percentage of the entire 20,000 [employees] who would
have said “yes” to that [question lies within .27 ±] 2(.014), that is,
between 24.2 percent and 29.8 percent. This [statement] has a
probability of about 0.95 of [being correct].
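
The arithmetic of this example can be checked with a few lines of code. The sketch below takes the weights, sample means, and standard errors from the table of the three factories; the function name `combine_strata` is ours and is introduced only for illustration.

```python
import math

def combine_strata(weights, means, std_errors):
    """Return the weighted mean, its estimated variance, and its standard error."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("stratum weights must sum to 1")
    mean = sum(w * m for w, m in zip(weights, means))
    # Each stratum variance (squared standard error) is weighted by w squared.
    variance = sum((w ** 2) * (se ** 2) for w, se in zip(weights, std_errors))
    return mean, variance, math.sqrt(variance)

weights = [0.6, 0.3, 0.1]          # relative proportion of employees
means = [0.20, 0.35, 0.45]         # proportion of "yes" answers per factory
std_errors = [0.020, 0.020, 0.040]

mean, variance, se = combine_strata(weights, means, std_errors)
low, high = mean - 2 * se, mean + 2 * se
print(f"mean = {mean:.2f}, var = {variance:.6f}, s.e. = {se:.3f}")
print(f"two-standard-error interval: {low:.3f} to {high:.3f}")
```

Nothing about the sample sizes or selection methods within the factories enters the calculation; only the three means, their standard errors, and the stratum weights are needed.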

Note the important implications of the above [procedure]. [In] each of
three subpopulations a [sample was taken; the sampling] procedures, and
the sizes of the samples [within] the strata, are unknown and irrelevant.
We have the sample means [and their] standard errors as they were
calculated separately [for each stratum]. [It is] desired to make an
estimate for the [entire] population represented by the sum of the three
subpopulations ([strata]). [The weights are] the relative number of
elements in each of the three strata [...] stratum; with the use of these
weights, the separate [estimates are] combined to obtain estimates for
the entire population.

There are three kinds of reasons for [stratifying a] sample:

(a) Stratification may be aimed [at decreasing the variances] of
the sample results for the [entire population]. Thereby greater precision
is obtained for the sample estimates, [a] constant goal of sample design.
Section 9 discusses one method of [this kind]; Section 11
discusses contrasting methods (1, pp. [...]).

(b) It may be thought necessary to use different
sampling methods or procedures in different [parts of the] population. For
example, in one of the factories [...] a sample of individuals
may be selected; in another, the sample may be selected in




The process of selection in each stratum is carried out separately
and independently. In each of the strata one may use a different
sampling fraction, and even different methods and procedures. Regardless
of the procedures used within the different strata, the strata means
may be combined to form an estimate for the population thus:

$$\bar{x}'_w = \sum_{h=1}^{R} w_h \bar{x}'_h.$$

That is, the sample mean $\bar{x}'_h$ obtained separately for each stratum
is multiplied by the weight of that stratum; then these products are
summed over the R strata to obtain the combined weighted estimate
for the entire population. The weight $w_h$ of each stratum is the
proportion of the total population contained in the stratum. The sum
of these weights is 1; that is, $\sum_{h=1}^{R} w_h = 1$.

The variance of the combined weighted mean will be the sum of
the variances of the individual strata means, each multiplied by the
square of the stratum weight:

$$\text{est. var.}(\bar{x}'_w) = \sum_{h=1}^{R} w_h^2 \, [\text{est. var.}(\bar{x}'_h)].$$

As an example, imagine that the factory mentioned in Section 2
is only one of three belonging to a company and that a separate survey
of attitudes was conducted in each. One of the questions appeared in
each of the three surveys, and it is now desired to estimate what percent
of all employees of the company would have said “yes” to this question.


Stratum (factory) number (h)                   1        2        3    Company total

Number of employees in stratum              12,000    6,000    2,000     20,000

Relative proportion of employees ($w_h$)       0.6      0.3      0.1        1.0

Sample mean (the proportion of “yes”
  answers) for stratum ($\bar{x}'_h$)          .20      .35      .45

Estimated standard error of sample
  mean of stratum [est. s.e. ($\bar{x}'_h$)]   .020     .020     .040

However, the information used for [stratification need]
be neither objective, accurate, nor [complete]; [...]
judgment can often be used pro[fitably] [...]

The role of stratification in [sampling is one of the most] often
misunderstood and exaggerated aspects of the select[ion process]. [...]
if stratification is used we may shut our eyes to o[ther aspects] [...].
This implication is used to justify the [use of] nonprobability methods
of selection, such as “quota sampling.” [To] establish stratification as
a sufficient condition f[or an adequate sample] design has no
basis in statistical theory. Each s[tratum is a] subpopulation, and the
principles of probability sampling must be applied to the selection
procedures used within the strata.

Sometimes it is implied that stratification is a necessary condition,
a “must,” for an adequate sample. [...] the sampler usually has to be
[content with] relatively dull variables (such as age, sex, etc.); [the
more] interesting variables (such as basic psychological make-up [...]
of the individual) are not available. In many practical cases [the gain
from] stratification is little; that is, the same precision may [be
obtained with] but a little more expense without the use [of
stratification. Nevertheless], stratification is used in most sampling
undertakings because it is generally beneficial and because it is easy
to apply.


9. A Proportionate Sample of Elements

Proportionate stratified [sampling] of elements is often in the back
of people's minds when they [speak of] “representative sampling,” when
they insist that the “different [parts of the] population must be properly
represented.” Let us describe a [proportionate stratified random]
selection of elements by means of an [illustration]. [Let us say] that we
want a sample of n = 400 employees out of the 12,000 employed in a
factory. [We] suspect that there may be important differences in the
attitudes to be measured among the employees in the different
departments. Therefore, the employees are listed [by the four] strata
(departments) into which the [population is] divided.

In order to have a [stratified sample], the selection must be carried
out separately in each [stratum]. [It is a] selection of elements because
the elements (employees) are selected individually, separately. In order
to make it a proportionate sample, the number of elements

clusters of work sections. These differences in procedures may be in
accord with differences in the physical distribution of the population
elements or with differences in the manner in which they are listed,
or they may be due to differences in total survey objectives within the
different strata.

(c) Sometimes strata are established because the subpopulations
are also designated “domains” of study, that is, the survey is designed to
provide sample results of some desired precision about the several
subpopulations separately, as well as about the combined whole. In
our illustration, the sample results ($\bar{x}'_h$) were desired for each of the
three factories, the domains of that study, and the desired precision
(s.e. of $\bar{x}'_h$) was used to determine the design and the size of the sample
to be taken in each.

What characteristics of the sampling units of a population should
be used for stratifying the units in order to reduce the variances of the
sample estimates for that population?

(a) The stratifying characteristics should be related to the variables
to be estimated from the study. The sorting of the sampling units on
the basis of the stratifying characteristics should establish strata which
will turn out to be (relatively) well sorted with regard to the variables
to be studied. The reduction of the variances of the sample results is
achieved in so far as the variation (of the characteristics studied)
among the sampling units within the strata is less than their variation
throughout the population. Hence, in stratifying, one attempts to
make the sampling units within the stratum as homogeneous as
possible.

(b) In most cases it is wasteful to spend much time worrying about
just which variables would be most suitable. Experience shows that
usually there is not much difference in the precision of two procedures
if both are based on some reasonably good stratifying variables. A
person acquainted with the subject matter will usually hit readily on a
reasonable choice. An elaborate and expensive further search may not
bring commensurate gains.

(c) Each sampling unit must be assigned to one of the strata.†

† If the information is not available for some of the sampling units, a
“miscellaneous” stratum is devised. Sometimes “double sampling” is used
to obtain information cheaply in the first phase for stratifying a large
sample; then, in the second phase, a smaller sample is subselected for the
main part of the study (1, p. 258; 22, p. 38; 16, p. 153).

                                              Foundry
                                                and                      Entire
                                   Assembly   machines  Office   Other   factory

Stratum number h                       1         2        3        4      Total

Stratum weight $w_h$                 .333      .250     .250     .167     1.00

Number of employees selected
  from stratum $n_h$                  133       100      100       67      400

Number of “yes” answers in
  stratum                              12        11       36       21       80

Proportion of “yes” answers           .09       .11      .36      .31      .20

Because the sample is self-weighting, [the sample mean may]
be taken simply as the total for the sample divided by [the] number of
cases in the sample:*

$$\bar{x}'_{prop} = \frac{1}{n} \sum x'.$$

In our example, we have $\bar{x}'_{prop}$ = 80/400 = .20. The subscript prop
of the sample mean identifies the [procedure] with which it was
obtained. The term [self-weighting means that, in] calculating the sample
mean, the sample cases are simply [added] without any special weighting.

If we had a stratified sample which [was not proportionate, the]
sample would not be self-weighting, [and]
the mean would have to be obtained by [the weighting] procedure:

$$\bar{x}'_w = \sum_h w_h \bar{x}'_h = .333 \times .09 + .250 \times .11 + .250 \times .36 + .167 \times .31 = 0.20.$$
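
The agreement between the simple mean and the weighted mean can be verified from the figures of the factory example. The following sketch uses the stratum sizes, sample sizes, and counts of “yes” answers given in the tables; the variable names are ours. The two results differ slightly only because the sample sizes of 133 and 67 are rounded.

```python
N_h = [4000, 3000, 3000, 2000]   # employees in each department (stratum)
n_h = [133, 100, 100, 67]        # employees selected from each stratum
yes_h = [12, 11, 36, 21]         # "yes" answers found in each stratum

# Self-weighting calculation: sample total divided by number of cases.
pooled = sum(yes_h) / sum(n_h)

# Weighted calculation: stratum proportions weighted by w_h = N_h / N.
weights = [N / sum(N_h) for N in N_h]
weighted = sum(w * y / n for w, y, n in zip(weights, yes_h, n_h))

print(f"simple mean   = {pooled:.4f}")
print(f"weighted mean = {weighted:.4f}")
```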

* As given in Section 8, $\bar{x}'_w = \sum_h w_h \bar{x}'_h$. The mean for the
stratum is $\bar{x}'_h = \frac{1}{n_h} \sum_i x'_{hi}$. In a proportionate sample,
$w_h = n_h/n$. Therefore, $\bar{x}'_{prop} = \frac{1}{n} \sum_h \sum_i x'_{hi}$.
The double summation means simply [that] the sum of all values in each
[stratum] is obtained; these partial sums must [be] added for all R strata
to obtain [the] sample total. This summation can be done [...] may
express [...]

                                               Foundry
                                                 and                     Entire
                         Symbol     Assembly   machine  Office  Others   factory

Stratum number              h           1         2        3       4      Total

Number of employees       $N_h$       4,000     3,000    3,000   2,000   12,000

Relative weight ($N_h/N$) $w_h$        .333      .250     .250    .167    1.000

Number of employees to
  be selected ($n w_h$)   $n_h$         133       100      100      67      400


stratum in the sample must be made proportionate to the number of
elements from each stratum in the population. The number of elements
in each stratum relative to the population total (N) is denoted by the
stratum weight $w_h = N_h/N$. Now if we multiply the total desired size
of the sample (n) by the weight of the stratum, we obtain the number of
elements to be selected in each of the strata, $n_h = n w_h$ (as shown on
the last line). Thus the sample will be proportionate because the
representation of each stratum in the sample is equal to the ratio of
that stratum in the population: $n_h/n = N_h/N$. For example, 133/400 =
4000/12,000 = .333.
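
The allocation on the last line of the table follows directly from the stratum sizes. A minimal sketch, with a function name of our own choosing, is given below; note that rounding each stratum separately happens to add up to n = 400 here, but in other cases a small adjustment may be needed to keep the total.

```python
def proportionate_allocation(N_h, n):
    """Number to select in each stratum: n times the stratum weight N_h / N."""
    N = sum(N_h)
    return [round(n * N_k / N) for N_k in N_h]

N_h = [4000, 3000, 3000, 2000]   # department sizes from the factory example
n_h = proportionate_allocation(N_h, n=400)
print(n_h, sum(n_h))
```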


Another way of regarding a proportionate sample is that the
sampling fraction in each stratum is equal to the sampling fraction for
the population as a whole: $n_h/N_h = n/N$. That is, the sampling fraction
[1/30 = 133/4000] = 100/3000 = 67/2000. (There is
a small difference due to the imprecise fraction in this case, as there is
usually in real cases. This is ordinarily a trivial matter.) That is, the
sampling fraction n/N = 400/12,000 = 1/30 is obtained; then this
fraction is applied to the numbers of elements ($N_h$) in each of the strata.

There is one more word to be explained in our definition of the
sample design: random. By random we shall mean that the selection of
the $n_h$ elements out of the $N_h$ in each stratum is to be made by a
separate random choice with equal probabilities among all the
elements, just as defined in Section 1. In stratum 1, for example, 133
different random numbers from 1 to 4000 must be taken from a table


* [...] described in the next section.

department; then from the employees in the department $n_h$ random
numbers were selected.

In practice a simpler device is commonly used: systematic
sampling. This is probably the most widely known design; it consists
of taking every kth individual after a random start from 1 to k. In this
instance one would take a number from 1 to 30 from a table of random
numbers. With that number as a start the interval of 30 is applied.
If the random start was number 28, we should have the numbers
28, 58, 88, 118, . . . , 11,998 in the sample. These numbers refer to the
consecutive numbers of the employees as they appear on the list. The
estimate of the mean here, as in a proportionate sample, is the usual
simple mean of the sample: $\bar{x}' = \frac{1}{n} \sum x'$. However, the
estimation of the variance of the mean is not clean-cut (1, pp. 179-182).
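
The selection itself is easily mechanized. The sketch below reproduces the example of a random start of 28 with an interval of 30 on a list of 12,000; the function name is ours, and in actual use the start would be taken at random from 1 to k (for instance with `random.randint(1, k)`) rather than fixed.

```python
def systematic_sample(N, k, start):
    """List positions selected by taking every k-th unit after the given start."""
    if not 1 <= start <= k:
        raise ValueError("the start must lie between 1 and k")
    return list(range(start, N + 1, k))

positions = systematic_sample(N=12_000, k=30, start=28)
print(positions[:4], positions[-1], len(positions))
```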

If the payroll cards of each department had been shuffled thoroughly
before they were ordered on the list, the systematic sample
would be equivalent to a proportionate stratified random sample. For
the latter, the shuffling process is not necessary because the procedure
of $n_h$ separate independent choices in each stratum accomplishes the
equivalent of a shuffling process. With the regular intervals of systematic
selection this shuffling is lacking. However, in many practical
instances the haphazard arrangements are considered to give results
similar to random selection within the strata; in those cases the formula
of stratified sampling would be used:

$$\text{est. var.}(\bar{x}') = \sum_h w_h^2 \, \frac{s_h^2}{n_h}.$$

Nevertheless, we should be wary. In stratified random selection
the arrangements of the units within the strata may be ignored because
the random selection will provide the necessary shuffling. With systematic
sampling there is need for reasonable reassurance that the
arrangements of the sampling units within strata may be regarded as
if they were random. There are schemes for using several different
random starts rather than just one.

The researcher should be alert for two kinds of departures from
randomness in the arrangement of the population units. If either of
these situations exists, or may exist, a systematic sample should be
avoided or modified.

simple random sample would obtain some number between 124 and
142 two thirds of the time, and between 114 and 152 in 19 samples
out of 20.

(c) We shall see in Section 11 that in some cases bigger gains may
be obtained from allocations of the sample which are not proportionate.
And in Section 22 an example is given to show that the gains of
proportionate sampling may be obtained without stratified selection,
by weighting the results in the separate strata so that in the combined
sample result the various strata are properly represented.

(d) Although the gains from it may be small, proportionate
stratified sampling is very widely used. One reason is that it is a safe
thing to do: the precision cannot be worse than if the sample is drawn
without stratification, and often it is better. Secondly, it is an easy
thing to do; quite often, as in our factory example, it can be done with
little or no effort. Thirdly, because the sample is self-weighting, the
calculations are simpler than with the use of either of the two methods
mentioned in the preceding paragraph.

(e) The gains arising from stratification are often greater when the
sampling units are clusters than for the selection of elements. Examples
of stratified samples of clusters are discussed in Sections 18-21.

(f) For a proportionate sample of elements, one should not waste
much time considering the exact variables to use for stratification. In
most cases the researcher can choose readily the useful stratifying
variables, in so far as they are available. In our example of the factory,
the departments suggested themselves. Perhaps sex, age, work sections,
or job classification could have been used instead of the departments or
as additional strata.


10. A Systematic Sample

In the taking of a sample of the employees of a factory (students of
a school or members of a club, etc.) a proportionate sample of
individuals would be a likely choice for sample design. Let us assume that
the payroll has been listed by departments and that within the
departments the names are arranged in a haphazard order or perhaps
alphabetically. We calculate that we want every employee to have a
400 ÷ 12,000 = 1 in 30 chance of selection. In Section 9 we went on
to designate the numbers $n_h = n(N_h/N)$ that were to be selected in each

formula which will be a useful approximation in these cases is one
which is based on the successive squared differences among the n
selected elements in their order of selection (1, p. 180; 22, p. 229):

$$\text{est. var.}(\bar{x}') = (1 - f) \, \frac{1}{2n(n-1)} \sum_{i=1}^{n-1} (x_i - x_{i+1})^2.$$

In Sections 12, 19, 20, and 21, examples of systematic selections of 
clusters are given. In Section 18, a formula similar to the one above is 
discussed in application to a cluster selection. 
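
A computation based on successive squared differences may be sketched as follows. The sketch assumes the usual form of that estimator, in which the sum of the n − 1 squared differences is divided by 2n(n − 1) and multiplied by (1 − f), with f the sampling fraction; the printed formula is not fully legible in this copy, so the divisor is our assumption, as are the function name and the small set of made-up values.

```python
def successive_difference_variance(x, f=0.0):
    """Estimated variance of the mean from successive squared differences.

    x holds the sample values in their order of selection; f is the
    sampling fraction n/N. Assumes the divisor 2n(n - 1).
    """
    n = len(x)
    if n < 2:
        raise ValueError("at least two selected elements are needed")
    sum_sq_diff = sum((a - b) ** 2 for a, b in zip(x, x[1:]))
    return (1.0 - f) * sum_sq_diff / (2 * n * (n - 1))

values = [1, 2, 3, 4]            # made-up values in order of selection
variance = successive_difference_variance(values)
variance_fpc = successive_difference_variance(values, f=1 / 30)
print(variance, variance_fpc)
```

Because only neighboring selections are compared, a slow drift along the list contributes little to this estimate, which is the reason for preferring it when the list has an ordering within departments.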


11. Allocation to Strata 


A method of using stratification to increase the precision of the
sample mean $\bar{x}'_w$ is the deliberate use of different sampling rates in the
various strata. The estimate $\bar{x}'_w = \sum_{h=1}^{R} w_h \bar{x}'_h$ can be made most
precise for a fixed cost if the sampling rate within each stratum is made
directly proportional to the standard deviation within the stratum and
inversely proportional to the square root of the cost per element in
that stratum. That is, for a minimum standard error of $\bar{x}'_w$, often called
“optimum allocation,” make $n_h/N_h$ proportional to $s_h/\sqrt{J_h}$.

Here $J_h$ is the cost per element, so that $\sum n_h J_h$ is the fixed total cost
related to the number of interviews (1, p. 73).
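
The rule may be illustrated with a short calculation. In the sketch below the stratum sizes are those of the factory example, but the standard deviations, the costs per element, and the budget are hypothetical figures chosen only for illustration, and the function name is ours. The sampling rates are made proportional to the standard deviation divided by the square root of the cost, and are scaled so that the total cost equals the budget.

```python
import math

def optimum_allocation(N_h, s_h, J_h, budget):
    """Sample sizes with n_h / N_h proportional to s_h / sqrt(J_h).

    The sizes are scaled so that sum(n_h * J_h) equals the budget;
    they are left unrounded.
    """
    shares = [N * s / math.sqrt(J) for N, s, J in zip(N_h, s_h, J_h)]
    cost_of_shares = sum(N * s * math.sqrt(J) for N, s, J in zip(N_h, s_h, J_h))
    scale = budget / cost_of_shares
    return [scale * share for share in shares]

N_h = [4000, 3000, 3000, 2000]   # stratum sizes from the factory example
s_h = [0.30, 0.30, 0.50, 0.45]   # hypothetical standard deviations
J_h = [1.0, 1.0, 4.0, 1.0]       # hypothetical cost per element
n_h = optimum_allocation(N_h, s_h, J_h, budget=400.0)

rates = [n / N for n, N in zip(n_h, N_h)]
total_cost = sum(n * J for n, J in zip(n_h, J_h))
print([round(n, 1) for n in n_h], round(total_cost, 6))
```

With equal costs in every stratum the same function reduces to making the sampling rates proportional to the standard deviations alone.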

Several points may be noted:

(a) If the cost per interview is the same in each stratum, the
problem becomes one of allocation of a fixed number (n) of interviews
to the various strata. In that case the sampling rate in each stratum
should be made proportional to the standard deviation within the stratum.

(b) This “optimum allocation” of the sample may also be viewed
as that which yields a desired variance of the estimated mean for
least cost (see Section 14).

(c) Of $s_h$ and $J_h$ only rough estimates are [needed]. [Great]
precision is not needed here; convenient rates [roughly]
proportional to those quantities ordinarily suffice. The difference in
precision is small between an optimum allocation and another which
is only roughly like it (3, p. 366).

(a) A trend. Imagine that, unknown to us, somebody sorted all the
3000 employees in the foundry according to increasing seniority, and
that is the order in which they appear on the list from which we shall
select the sample. Then we are to select our systematic sample of every
30th after a random start. One of the 30 possible samples would select
the employees numbered 1, 31, 61, . . . , 2971; another sample would
consist of numbers 30, 60, 90, . . . , 3000. In the latter possible sample
each employee has more seniority than his counterpart in the former
by 29 ranks. The means of these two samples would be widely divergent
as regards seniority and other variables strongly associated with it.
Hence, there is a great deal of variability among the 30 possible
samples; results on any characteristic strongly related to seniority
would be dependent on which of the 30 possible samples happened to
be chosen.

(b) A cyclical fluctuation. Imagine that one of the departments is
composed of work sections each of which contains 10 employees and
that somebody has arranged the listing so that the 10 employees of
each section are together and in the order of their seniority. Thus we
see a cyclical fluctuation of seniority with a “period” of 10 employees.
Now if a sample of every 30th employee is selected, there are 30
possible samples. Three of these samples (those with the random starts
of 1, 11, or 21) would select the most senior employee in each group of
10, three others the most junior, and the other samples would fluctuate
similarly. Again, we see a great deal of variability depending on which
among the 30 possible samples happens to be selected (1, pp. 160-174).
In addition to an unduly great amount of variability, there is
another serious objection to a systematic sample under these
circumstances. The calculations made from the sample will not show this
source of variability. Thus the true standard error may be grossly
underestimated from the sample data.
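
Both dangers can be seen by listing all 30 possible systematic samples. In the sketch below the two lists are artificial ones built to match the descriptions above: a list of 3000 seniority ranks in increasing order, and a list of work sections of 10 in which seniority runs from 1 to 10 within each section. The function name is ours.

```python
from statistics import mean, pvariance

def all_systematic_samples(values, k):
    """Return the k possible systematic samples of every k-th listed value."""
    return [values[start::k] for start in range(k)]

# (a) Trend: 3000 employees listed in order of increasing seniority rank.
trend = list(range(1, 3001))
trend_means = [mean(sample) for sample in all_systematic_samples(trend, 30)]

# (b) Cycle: work sections of 10, each listed in order of seniority 1..10.
cycle = [(i % 10) + 1 for i in range(3000)]
cycle_samples = all_systematic_samples(cycle, 30)
cycle_means = [mean(sample) for sample in cycle_samples]
within_variances = [pvariance(sample) for sample in cycle_samples]

print(min(trend_means), max(trend_means))
print(sorted(set(cycle_means)), max(within_variances))
```

In the cyclical list every possible sample consists of identical values, so a variance computed from the sample itself would be zero, while the sample means range over the whole scale of seniority.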

Nevertheless, systematic sampling is used very widely. Because of
the irregular arrangement of so many kinds of populations, the formula
for stratified random sampling is frequently an adequate approximation,
as we noted above. However, suppose that within each department
of our illustration there is an arrangement (by work sections, for
example) which makes for stratification within each department. That
stratification would not be reflected in the formula based on the
assumption of simple random selection within the department. A

natural boundaries so that every dwelling is located clearly inside one
of the blocks. The blocks are numbered consecutively; this numbering
establishes a list of the blocks, the numbers going from 1 to 750. The
number of dwellings per block is variable, but the average is 12,000 ÷
750 = 16. Now a sample is chosen by selecting in a random manner 25
blocks out of 750 and including in the sample all dwelling units found
within the boundaries of each of the 25 sample blocks. Note that the
probability of selection of any dwelling in the city is 25/750 = 1/30.
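
The selection of whole blocks may be sketched as follows. The numbers of dwellings per block used here are invented for the illustration (the chapter states only that they vary and average about 16), a pseudo-random generator stands in for a table of random numbers, and the function name is ours.

```python
import random
from fractions import Fraction

def select_blocks(n_blocks, n_selected, seed=None):
    """Select block numbers at random, without repetition, from 1..n_blocks."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, n_blocks + 1), n_selected))

# Hypothetical numbers of dwellings found in each of the 750 blocks.
size_rng = random.Random(0)
dwellings_per_block = {b: size_rng.randint(4, 28) for b in range(1, 751)}

chosen = select_blocks(750, 25, seed=3)
# Every dwelling in a selected block enters the sample.
sample_dwellings = [(b, d) for b in chosen
                    for d in range(1, dwellings_per_block[b] + 1)]

probability = Fraction(25, 750)   # the same for every dwelling in the city
print(probability, len(chosen), len(sample_dwellings))
```

The probability of selection is fixed at 1/30 for every dwelling, but the number of dwellings actually obtained depends on which blocks happen to be drawn.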

In this example the elements are dwelling units, since the analysis
will be in terms of characteristics of dwelling units. The sampling unit,
however, is the block; the selection is made from a complete list of all
the sampling units in the population, that is, the numbered list of
blocks. Each element belongs to one, and only one, of these sampling
units. The selection of each sampling unit results in the selection of the
cluster of elements which it contains.

Cluster sampling is the name given to methods of selection in 
which the sampling unit, the unit of selection, contains more than one 
population element; the sampling unit is a cluster of elements. In our 
Ulustration the block is the cluster, composed of dwelling units as 
elements. But just what is a cluster or an element is only. ° 

practical expediency. In some studies the dwelling unit will be n^ardrf 
as a cluster of persons, whereas in another study the population ele- 
ments might be blocks or even eilies. The same physical 
the people of the United States, may be regarded in turn as 
of units which are states or counties or cities, towns, and townships or 
blocks or dwellings or, finally, the "dividual persons. The elemen 
the population are defined in accord with study o jectives. 
are defined in conformity with the requirements of a 
economical sampling design applied to the physical distribution 

population (3, pp. 135-146). question arises 

After the individual elements are » . _ rontrari- 

whether they should also serve as sampling units or 
wise, it may be more economical and praetieJ to define a ^mphng 
unit whichl a cluster of the P^^^'V 

be several examples given, in later sretion , j. ,^ognition 

to the use of clusters'. The choice of eluste^ is ^ 
in the sampling procedures of some exis g . ^ factorv 

of the popltion Thus, one might study the employees 
(d) For the reason just given, ordinarily it does not pay to resort to
disproportionate allocation unless there are substantial differences
among the estimates for $s_h$ or those for $J_h$ among the various strata.
If those differences are large, the gain over proportionate sampling
may be large. However, in the estimation of proportions, usually no
great gain may be had through disproportionate sampling; their
standard deviations are $\sqrt{p_h (1 - p_h)}$, and that quantity is not
sensitive to the kind of fluctuations one usually encounters with
values of $p_h$ between .10 and .90.

(e) Disproportionate allocation should be used only with caution,
perhaps only on expert advice. The optimum allocation for one item
on a survey may result in large losses of precision (greater standard
errors) for some other items on the survey. Furthermore, although the
“self-weighted” proportionate sample makes for easy computations,
the calculation of the sample mean of a disproportionate sample
involves weighting in inverse proportion to the sampling rates. The
weighted calculations of a disproportionate sample may be costly. This
added cost of tabulation, not included in the formulation of “optimum
allocation” above, should be considered before a sample design with
disproportionate sampling rates is adopted.
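
The weighting in inverse proportion to the sampling rates can be sketched as follows. This is only an illustration: the two strata, their sampling rates, and the observed values are invented, and the function name `weighted_mean` is hypothetical.

```python
def weighted_mean(strata):
    """Estimate a population mean from a disproportionate stratified sample.

    strata: list of (sampling_rate, sample_values) pairs. Each observation
    is weighted by 1 / sampling_rate, so a case drawn at a low rate stands
    for more population elements than a case drawn at a high rate.
    """
    weighted_total = sum(sum(values) / rate for rate, values in strata)
    weighted_count = sum(len(values) / rate for rate, values in strata)
    return weighted_total / weighted_count

# Hypothetical data: stratum 1 sampled at 1 in 4, stratum 2 at 1 in 2.
strata = [(0.25, [1, 1, 0, 1]), (0.5, [0, 0, 1, 0])]

estimate = weighted_mean(strata)                       # uses the weights
unweighted = sum(sum(v) for _, v in strata) / 8        # ignores the weights
```

With equal sampling rates in every stratum the weights cancel, which is why the proportionate sample is called self-weighted.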


CLUSTERING 


12. Cluster Sampling 

In a factory many instances would arise that would call for a
cluster sample. It might happen, for example, that employees would be
selected not individually but in clusters of work sections. Let us move
on, however, to another illustration. Let us say that we want a sample
of 400 out of the estimated 12,000 dwellings of a city, with equal
probability of selection for each dwelling. Were a list of the city’s
dwellings available, the procedures of sampling discussed in Sections 1,
10, and 11 could be used here, too. Suppose, however, that such a list
is not available and it is deemed too costly to prepare. Suppose,
furthermore, that it is desired to economize on the costs of locating
dwellings by means of sampling entire blocks. The entire area of the
city’s map is divided into blocks along identifiable streets and other
natural boundaries so that every dwelling is located clearly inside one
of the blocks. The blocks are numbered consecutively; this numbering
establishes a list of the blocks, the numbers going from 1 to 750. The
number of dwellings per block is variable, but the average is 12,000 ÷
750 = 16. Now a sample is chosen by selecting in a random manner 25
blocks out of 750 and including in the sample all dwelling units found
within the boundaries of each of the 25 sample blocks. Note that the
probability of selection of any dwelling in the city is 25/750 = 1/30.
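
The selection of whole blocks may be sketched in a few lines. The block sizes below are invented for illustration (the text gives only their average of 16), and the function name `select_block_sample` is hypothetical.

```python
import random

def select_block_sample(block_sizes, n_blocks, seed=None):
    """Select n_blocks whole blocks at random, without replacement.

    Every dwelling in a selected block enters the sample, so the size of
    the sample depends on which blocks happen to be drawn.
    """
    rng = random.Random(seed)
    chosen = rng.sample(range(len(block_sizes)), n_blocks)
    sample_size = sum(block_sizes[b] for b in chosen)
    return chosen, sample_size

# Hypothetical city: 750 blocks with between 4 and 28 dwellings each.
size_rng = random.Random(1)
block_sizes = [size_rng.randint(4, 28) for _ in range(750)]

chosen, sample_size = select_block_sample(block_sizes, n_blocks=25, seed=2)

# Each dwelling is selected exactly when its block is selected, so its
# probability of selection is 25/750 regardless of the size of its block.
selection_probability = 25 / 750
expected_sample_size = selection_probability * sum(block_sizes)
```

Note that `sample_size` will seldom equal `expected_sample_size` exactly; only the probability of selection is fixed by the design.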

In this example the elements are dwelling units, since the analysis
will be in terms of characteristics of dwelling units. The sampling unit,
however, is the block; the selection is made from a complete list of all
the sampling units in the population, that is, the numbered list of
blocks. Each element belongs to one, and only one, of these sampling
units. The selection of each sampling unit results in the selection of the
cluster of elements which it contains.

Cluster sampling is the name given to methods of selection in
which the sampling unit, the unit of selection, contains more than one
population element; the sampling unit is a cluster of elements. In our
illustration the block is the cluster, composed of dwelling units as
elements. But just what is a cluster or an element is only a matter of
practical expediency. In some studies the dwelling unit will be regarded
as a cluster of persons, whereas in another study the population
elements might be blocks or even cities. The same physical population,
the people of the United States, may be regarded in turn as composed
of units which are states or counties or cities, towns, and townships, or
blocks or dwellings or, finally, the individual persons. The elements of
the population are defined in accord with study objectives. The clusters
are defined in conformity with the requirements of a practical and
economical sampling design applied to the physical distribution of the
population (3, pp. 135-146).

After the individual elements are defined, the question arises
whether they should also serve as sampling units or whether,
contrariwise, it may be more economical and practical to define a
sampling unit which is a cluster of the previously defined elements.
There will be several examples given, in later sections, of situations
which lead to the use of clusters. The choice of clusters is generally a
recognition in the sampling procedures of some existing features in the
make-up of the population. Thus, one might study the employees of a
factory
would the counting of 290 nonselected cards for each cluster of 10
selections. In practice, some approximate procedure might be used,
such as the use of a ruler to measure off the clusters. Let us say that
of the 400 subscribers interviewed 148 gave a “yes” answer to a
question. The number of “yes” answers out of the 10 interviews in
each of the 40 clusters was also obtained. These are, with the sections
arranged in the same order in which they were selected:

4 7 1 1 3 5 3 6 2 3 2 2 1 4 4 6 3 2 7 1
5 5 4 6 8 2 0 3 5 6 9 2 1 0 3 4 7 5 2 4

The sample mean is calculated as before by dividing the sample total
by the number of cases:

$$\bar{x}'_{ec} = \frac{1}{n} \sum x.$$

In the present instance we have $\bar{x}'_{ec} = \frac{1}{400}(148) = .37$.

Denote by m the number of clusters and by A the number of elements
per cluster, so that mA = n. (In the present instance we have 40 × 10 =
400.) Let $X_i$ denote the sum of the values of the characteristic x for the
A values in the ith cluster; in our example, the 40 values of $X_i$ are
given above as 4, 7, 1, ..., 5, 2, 4. The mean of the ith cluster will be
denoted as $\bar{x}_i = \frac{1}{A} X_i$. These 40 values may be found by
dividing the values above by A = 10; hence they will be 0.4, 0.7, 0.1, ...,
0.5, 0.2, 0.4. These represent the mean of the characteristic for the
elements in the cluster, that is, the proportion of subscribers in the
different clusters who said “yes.” The subscript “ec” is used here to
denote “equal clusters,” whereas the subscript “c” is used later when the
clusters are not necessarily equal. It may be noted that the sample
mean is also equal to the mean of the cluster means:

$$\bar{x}'_{ec} = \frac{1}{m} \sum_{i=1}^{m} \bar{x}_i.$$

In our example, $\bar{x}'_{ec} = \frac{1}{40}(14.8) = .37$.

In order to obtain the sample mean, it is not necessary to go to the
and select work groups; one might study the students of a university
and select them in clusters of classes; one might select dwellings in
clusters of blocks. In any case, one must ensure that every element of
the designated population belongs to one, and only one, of the clusters.
Otherwise, special measures must be taken.

CLUSTERS OF EQUAL SIZE. Clusters of equal size rarely exist in
society. They come into being only as the result of planning. Army
units, the units of large housing developments, and some work sections
in large establishments may sometimes be of equal size, or nearly so.
Sometimes, however, the sampler creates equal-sized clusters where
none existed before, as in the example given below. Furthermore, it is
common practice to create equal-sized subsample clusters by the
procedures of sampling unequal-sized clusters with probabilities
proportional to size discussed in Section 20. Clusters of equal size can be
treated simply as a special case of samples of unequal size. However,
the subject of clusters of unequal sizes is rather complex.

As an example, take the file of subscribers of a newspaper. There
are 12,000 subscribers served by carrier routes. There is a card for each
subscriber in the file. The 100 to 200 cards of each carrier route are
kept together, and neighboring routes follow one another. An interview
survey of about 400 subscribers is wanted, and in order to save travel
time it is decided to take clusters of 10 subscribers each. It is estimated
that 10 short interviews in the same neighborhood can be generally
obtained in half a day’s work on the spot.

Now, imagine that the file is divided into 1200 clusters of 10
consecutive cards each. A sample of 40 of the 1200 clusters is to be
selected. The drawing of 40 different random numbers from 1 to 1200
would give a random selection of the clusters.* However, our aim is to
be practical, and in practice a systematic sample of clusters would
generally be taken (see Sections 10 and 18). Let us say that after a
random start from 1 to 30, every 30th cluster was chosen; we thus
have 40 clusters of 10 cards each. Altogether 400 out of 12,000 were
selected, each with a probability of selection of 1 in 30.
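
The systematic selection just described can be sketched as below. The function names are hypothetical, and the numbering of cards by their position in the file is an assumption made for the illustration.

```python
import random

def systematic_cluster_sample(n_clusters=1200, interval=30, seed=None):
    """Pick a random start from 1..interval, then take every interval-th cluster."""
    rng = random.Random(seed)
    start = rng.randint(1, interval)
    return list(range(start, n_clusters + 1, interval))

def cards_in_cluster(cluster_number, cluster_size=10):
    """Positions (counting from 1) of the consecutive cards forming one cluster."""
    first = (cluster_number - 1) * cluster_size + 1
    return list(range(first, first + cluster_size))

selected_clusters = systematic_cluster_sample(seed=7)
selected_cards = [card for k in selected_clusters for card in cards_in_cluster(k)]
```

Whatever the random start, the procedure yields 40 clusters and 400 cards, and each card has the same chance, 1 in 30, of falling in the sample.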

To divide the file into 1200 clusters would take some time, so

* Note that “random selection” here denotes 40 selections with equal
probability from the 1200 units in the population. The sample of 400 cases is
obtained by means of only 40 choices. This has important consequences for the
sampling error, as discussed in the next section.





of the entire population of 12,000 subscribers is between 30.0 and 44.0
percent. That statement has a 95-percent chance of being correct and
a 5-percent chance of being wrong. It should be noted that this
formula for the variance is entirely appropriate only if the m clusters
are made with m random choices. But for practical reasons the selection
was systematic. The results of our systematic selection would be
equivalent to a random choice only if the order of the clusters in the
file had been thoroughly randomized. They were not, and we were
told that “neighboring routes follow one another.” This matter will be
treated in Section 18, where we shall find that for our example the
approximation is not bad.

CLUSTERS OF UNEQUAL SIZES. When the cluster is a human group, it
will usually contain varying numbers of elements. This will be true of
dwellings in blocks, employees in work groups, students in classes.
There are several important consequences. The planning and
administration of the study must reckon with the fact that it has no
exact control over the size (n) of the sample. Thus, the sample of 25
blocks, mentioned earlier in this section, will not contain exactly the
planned 400 dwellings but more or fewer, depending on the sample of
blocks that happens to be selected. Sometimes the variation in the size
of the sample can be reduced through the use of information on the
sizes of the clusters (see Section 20). Let us assume, however, that we
are dealing with the simpler model in which m clusters are selected at
random from the M clusters which compose the population. The
estimate of the mean is usually calculated as the simple mean of the
sample elements: $\bar{x}'_c = \sum x / n$. The estimate of the mean may
be seriously biased if m is small and there is inequality in the size of
the clusters ( ..., Chap. 6). Thus, the sample mean is no longer equal to
a simple mean of the cluster means, as it was for equal-sized clusters.
However, an analogy may be found in that the sample mean is equal to
a weighted mean of the cluster means; the weights are the relative
sizes of the clusters. Thus:

$$\bar{x}'_c = \frac{1}{m} \sum_{i=1}^{m} \frac{N_i}{N'}\,\bar{x}_i.$$

Here $N' = n/m$ is the average number of elements in the sample
clusters. Hence $N_i/N'$ is the size of the cluster relative to the average
size in the sample.

The variance of this estimate has an analogous form to the
variance shown before:
trouble of calculating the separate cluster values. However, we need
them to obtain the estimated variance of the sample mean:

$$\text{est. var.}(\bar{x}'_{ec}) = (1 - f)\,\frac{s_b^2}{m}.$$

Here $s_b^2 = \frac{1}{m-1} \sum_{i=1}^{m} (\bar{x}_i - \bar{x}'_{ec})^2$ is the
variance of the cluster means around the sample mean. Note the
similarity of the formula to that for a simple random sample: here, too,
we have a variance divided by the number of independent sampling
units. The units involved in the variance calculations are also the
sampling units: the clusters. The quantity (1 - f) is again the usually
inconsequential “correction for finite population.” With m clusters
selected out of a total of M in the population, m/M is the sampling
fraction, and (1 - f) = 1 - m/M. In the present instance we have
(1 - f) = 1 - 40/1200 = 1 - 1/30. We have:

$$\text{est. var.}(\bar{x}'_{ec}) = \frac{29}{30}\left(\frac{1}{40} \times \frac{1.964}{39}\right) = \frac{29}{30}\left(\frac{1}{40} \times .0504\right) = .967 \times .001259 = .001217.$$

Here $\sum (\bar{x}_i - \bar{x}'_{ec})^2 = (.4 - .37)^2 + (.7 - .37)^2 + \cdots + (.2 - .37)^2 + (.4 - .37)^2 = 1.964$.
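
The arithmetic of this example can be checked with a short sketch. The 40 counts are those listed for the newspaper example, with the first value of the second row read as 5 (an assumption, though one that agrees with the stated total of 148); the variable names are chosen for the illustration only.

```python
import math

# "Yes" answers in each of the 40 clusters of 10 interviews.
yes_counts = [4, 7, 1, 1, 3, 5, 3, 6, 2, 3, 2, 2, 1, 4, 4, 6, 3, 2, 7, 1,
              5, 5, 4, 6, 8, 2, 0, 3, 5, 6, 9, 2, 1, 0, 3, 4, 7, 5, 2, 4]

A = 10                      # elements per cluster
M = 1200                    # clusters in the population
m = len(yes_counts)         # clusters in the sample
n = m * A                   # elements in the sample

sample_mean = sum(yes_counts) / n
cluster_means = [x / A for x in yes_counts]
mean_of_cluster_means = sum(cluster_means) / m

# Variance of the cluster means around the sample mean.
sum_sq_dev = sum((xb - sample_mean) ** 2 for xb in cluster_means)
s_b2 = sum_sq_dev / (m - 1)

# Estimated variance and standard error of the sample mean.
f = m / M
est_var = (1 - f) * s_b2 / m
std_err = math.sqrt(est_var)

print(round(sample_mean, 2), round(s_b2, 4), round(est_var, 6), round(std_err, 3))
```

The divisor in the last step is m, the 40 independent choices, and not n, the 400 interviews.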

The standard error is $\sqrt{.001217} = .035$, and the confidence intervals
are .37 ± 2 (.035). That is, we make the statement that the proportion
of “yes” answers that would have been obtained by a similar survey

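* More convenient calculational forms are:

$$\text{est. var.}(\bar{x}'_{ec}) = (1 - f)\,\frac{1}{m(m-1)} \left(\sum \bar{x}_i^2 - m\,\bar{x}'^{\,2}_{ec}\right) = (1 - f)\,\frac{1}{m(m-1)} \left(\frac{1}{A^2} \sum X_i^2 - m\,\bar{x}'^{\,2}_{ec}\right).$$

Here $\sum X_i^2 = 4^2 + 7^2 + 1^2 + \cdots + 2^2 + 4^2 = 744$, and

$$\frac{1}{39}\left(\frac{744}{100} - 40\,(.37)^2\right) = \frac{1.964}{39} = .0504.$$

This quantity, as above, is the value of $s_b^2$, the estimate of the
variance of single clusters in the population.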
$$\sqrt{(1 - f)\,\frac{p'(1 - p')}{n - 1}} = \sqrt{\frac{29}{30} \times \frac{(.37)(.63)}{399}} = \sqrt{.000565} = .024;$$

also 1.96 × .024 = .047.

Hence he makes the statement that the population value was within the
limits of 37 percent ± 4.7 percent, and he would say that this statement
has but a 5-percent chance of being incorrect. We are wiser; we know
that the true standard error is .035. Hence 4.7 percent is equivalent not
to 1.96 standard errors but only to .047/.035 = 1.34 standard errors.
From the appropriate tables we find that a statement based on a
confidence interval of 1.34 times the standard error has a probability
of 0.18 of being incorrect. Hence, the consequence of the researcher’s
mistake is that his confidence, and that of his readers, in his results is
misplaced: his statements will be incorrect in the long run not 5 times
in 100 but 18 times in 100.

The effect of clustering on the sampling error of estimates can be
serious. Yet it is disregarded very frequently. It is quite common to
find standard errors reported for estimates which grossly
underestimate the variability of the sample and consequently grossly
overestimate the significance of the sample results. Marks (17) found
that for the revision of the Stanford-Binet scale the ratio of the actual
standard error to the incorrect $s/\sqrt{n}$ is 3.36. Therefore, the
researcher using 1.96 of his incorrect “standard errors” is using only
1.96/3.36 = 0.58 of the actual standard error. With the use of these
incorrect intervals the researcher would be making incorrect statements
not 5 times in 100, as he hoped, but 56 times in 100.

A convenient measure of the effect of clustering can be given in
terms of the coefficient of intraclass correlation, rho. It actually is
based on the ratio of two variances, which can be estimated from two
values we already have. The variance of the cluster sample is: (1 - f)
$s_b^2/m$ = .001217. The value of (1 - f) $s^2/n$ = .000565 is a usable
estimate for the value of the variance to which a simple random sample
of the same population would be subject. The ratio of the two is:

$$\frac{(1 - f)\,s_b^2/m}{(1 - f)\,s^2/n} = 1 + \text{est. rho}\,(A - 1).$$
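
The chances of error quoted above for the mistaken confidence intervals can be recovered without tables. The sketch below assumes the normal distribution, as the text does; the function name `chance_of_error` is hypothetical.

```python
import math

def chance_of_error(z):
    """Two-sided probability that a normal deviate exceeds z standard errors."""
    return math.erfc(z / math.sqrt(2))

# What the researcher believes: limits at 1.96 standard errors.
intended = chance_of_error(1.96)

# What he has in the subscriber example: .047 is only .047/.035 of the
# true standard error.
actual = chance_of_error(0.047 / 0.035)

# The case reported by Marks: the true standard error is 3.36 times the
# one computed as if the sample were simple random.
marks_case = chance_of_error(1.96 / 3.36)

print(round(intended, 2), round(actual, 2), round(marks_case, 2))
```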
$$\text{est. var.}(\bar{x}'_c) = (1 - f)\,\frac{1}{m(m-1)} \left[\sum_{i=1}^{m} \left(\frac{N_i}{N'}\right)^2 (\bar{x}_i - \bar{x}'_c)^2\right].$$

That is, the squared deviations now are multiplied by the squares of the
relative weights of the clusters.


13. The Effects of Clustering; Intraclass Correlation

Of the various modifications of sample design (listed in Section 7),
clustering usually has by far the greatest effect. If we compare a sample
of n independently selected elements with another sample containing a 
like number of elements but selected in m clusters, we note that the 
number of independent choices involved is reduced considerably. 
Although the former sample would be well spread over the population, 
the latter would be bunched in spots. In the example in Section 12, 
the sample of dwellings was confined to only 25 of the city’s 750 
blocks. This clustering would be of no consequence if all the elements 
in the population were scattered at random into the different clusters. 
In most practical cases, however, we find that the elements in a 
cluster tend to be more like other elements in the same cluster than 
like elements in other clusters. The various dwellings in the same 
block will show a greater homogeneity than a similar number of 
dwellings scattered throughout the city. The measure of this homo- 
geneity is the intraclass correlation. 

In the example of newspaper subscribers of Section 12, we found 
that the estimate based on 40 clusters of 10 subscribers each was 
subject to a variance which was estimated as: 

$$\text{est. var.}(\bar{x}'_{ec}) = (1 - f)\,\frac{s_b^2}{m} = .001217.$$

The standard error was estimated at $\sqrt{.001217} = .035$.

Suppose that after this cluster sample was selected the researcher
mistakenly used the formula for the standard error of a simple random
sample, which is inappropriate for the sample design. What is the
consequence of this mistake? In our example he would take (see
Section 1):
another. These tendencies to homogeneity in groups hold for most
characteristics. The homogeneity of people (the segregation of
characteristics) is greater than would exist if people were selected into
groups by random choice. These tendencies may be due to selection, to
mutual influence, to the correlation among characteristics, or to a
combination of these. In any case, the tendency is a social phenomenon
in that rho is a measure that belongs to the group as such; it has no
meaning for the individual except in so far as he is considered as a
member of a group (or cluster). Rho is a measure of the proportion of
the total variance which elements in the same sampling unit, the
cluster, have in common. It may be looked on as a measure
of homogeneity or segregation of the elements within the clusters.
Given a set of elements, the greater the tendency to the clustering
of similar elements, the greater is the rho.

It should be of great utility in the social sciences as a tool for
description and comparison. Of course, rho is specific for a
characteristic. For the same grouping of individuals, different variables
will exhibit different rhos. Also, for the same variable the value of
rho depends on the actual distribution of the population into the
groupings being considered.

That individuals are segregated in groups is of importance
to the sampler because he often uses those groups (or some
approximation thereof) as sampling units in the selection. Therefore,
the variance of sample results will be influenced by the clusters used in
the selection process. We see, then, that the precision of the results of a
cluster sample cannot be given simply in terms of the numbers of
elements (cases) in the sample. The numbers of sampling units selected
(that is, the numbers of the different kinds of clusters used) will be of
importance also.


PRACTICAL PROCEDURES


The results of a sample based on clusters are almost always subject
to greater sampling error than those of a sample of the same number of
elements selected individually. Why, then, use clustering in selection?
In our example, A (the number of elements in each cluster) is 10.
Hence we have:

$$1 + \text{est. rho}\,(10 - 1) = \frac{.001217}{.000565} = 2.15,$$

$$\text{and est. rho} = \frac{2.15 - 1}{10 - 1} = +.13.$$

That is, the variance of the cluster sample in this case is estimated as
2.15 times the variance one would expect from a simple
random sample of the same number of elements. This ratio of the
variances can be expressed as due to an estimated rho of +.13.
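
The step from the ratio of the variances to the estimated rho can be written out as a small sketch; the function name `estimated_rho` is hypothetical, and the two variances are the values computed in the text.

```python
def estimated_rho(var_cluster, var_simple_random, A):
    """Solve  ratio = 1 + rho * (A - 1)  for rho, where ratio is the
    variance of the cluster sample over that of a simple random sample
    of the same number of elements."""
    ratio = var_cluster / var_simple_random
    rho = (ratio - 1) / (A - 1)
    return ratio, rho

ratio, rho = estimated_rho(0.001217, 0.000565, A=10)
print(round(ratio, 2), round(rho, 2))
```

Run the other way, the same relation shows how even a small rho inflates the variance when the clusters are large: with rho = .13 and clusters of 50, the ratio would be 1 + .13 (49), or more than 7.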

When rho is positive, the ratio of the variances is greater than 1
and the cluster sample has a greater variance than a simple random
sample of the same size would have. The maximum possible value for
rho is +1, in which case the ratio has a value of A. This corresponds
to a case of complete segregation of a characteristic: all the individuals
of every cluster are exactly alike with regard to that characteristic.
Note, also, that the effect of clustering is equal to rho (A - 1); hence, a
relatively small rho may have a serious effect on the variance if the
size of the cluster (A) is large. In the case of clusters of unequal sizes,
the average size of the cluster can be used in place of A. In a
subsampling design (Section 19) the average number of subsampled
elements per cluster plays the same role.

The variance of a cluster sample is almost always greater than
that of a simple random sample of the same size. However, this is a
sociological fact and not a logical necessity. For example, if balls are
selected from a “well-mixed urn,” it makes no difference whether one
takes them in clusters or singly; the balls will be randomly distributed
in the clusters. In this situation we would expect a value of 1 for the
ratio of the variances and a value of zero for rho. However, people do
not enter human groups “well mixed.” Furthermore, although a
negative rho is possible, it is a rare phenomenon for social variables. The
lowest possible value is $-\frac{1}{A - 1}$, corresponding to a zero value for the
ratio.

For most human groups, rho tends to be positive. That is, the
individuals associated with human groups tend to resemble one
Because the cost per element in cluster sampling is less, often
substantially so, than for a sample of individually selected elements.
Clustering should be preferred over individual selection in so far as the
lowering of the cost per element due to clustering is greater than the
increase in the variance per element.

Let us begin with the data of the 400 subscribers in 40 sample 
clusters in Sections 12 and 13. Suppose that the same characteristic 
is to be measured again next year and the cost of the study is specified. 
We want to compare two sample designs in order to use the one which 
yields the smaller variance for the available expenditure. First, estimate 
how many sampling units can be obtained for the available money 
under each of the two designs. Suppose that it is estimated that with 
one design 300 subscribers in 30 clusters may be obtained; or one 
could get 100 subscribers selected individually. Which of those two 
samples will have the smaller variance? The variance of the cluster 
sample is: 

$$\text{approx. var.}(\bar{x}'_{ec}) = \frac{s_b^2}{m}.$$

The factor of (1 - m/M) is neglected in this discussion, and it is seldom
of any consequence (see Section 1). We found in Section 12 that in our
example the variance of individual clusters is estimated as .0504.
Therefore, the variance of a sample of 30 clusters is estimated as
.0504/30 = .0017.

The variance of the simple random sample is

$$\text{approx. var.}(\bar{x}') = \frac{s^2}{n}.$$

Again, we neglect the factor (1 - f). Since we deal with a proportion:

$$s^2 = \frac{n\,p'(1 - p')}{n - 1} = \frac{400\,(.37)(.63)}{399} = .234.$$

This is a usable estimate of the variance of individual elements. (The
formula was given in Section 1, and the value of .37 was used in
Sections 12 and 13.) Therefore, the variance of a sample of 100 elements is
estimated as .234/100 = .00234. Since for the available money the cluster
sample yields the smaller variance, the greater precision, it should be
preferred.
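
The comparison at fixed cost comes to two divisions, sketched below with the unit variances found in the text (.0504 for single clusters, .234 for single elements). The variable names are chosen for the illustration, and the finite population correction is neglected as above.

```python
cluster_unit_variance = 0.0504    # variance of single clusters of 10
element_unit_variance = 0.234     # variance of single elements

# What the available money is assumed to buy under each design.
clusters_affordable = 30          # that is, 300 subscribers in clusters
elements_affordable = 100         # subscribers selected individually

var_cluster_design = cluster_unit_variance / clusters_affordable
var_element_design = element_unit_variance / elements_affordable

preferred = ("cluster sample" if var_cluster_design < var_element_design
             else "individual selection")
print(round(var_cluster_design, 4), round(var_element_design, 5), preferred)
```

The verdict depends entirely on the two cost estimates; were the same money to buy 150 individually selected subscribers, the variance .234/150 = .00156 would favor individual selection.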
Alternately, one may begin with a fixed precision that he wants
the sample to have. Suppose, now, that our purpose is again to measure
the same characteristic on another occasion but now it is specified that
a precision of 6 percent is required at the 0.95 probability level. This
means that a standard error of 3 percent = .03 is required. That is,
the variance of the sample mean is to be (.03)² = .0009. Which of the
two designs will yield a sample of that precision for less cost? Using
the equation for the variance of a cluster sample, we find the number of
clusters necessary for the desired variance of .0009:

$$.0009 = \frac{.0504}{m}, \quad \text{or} \quad m = \frac{.0504}{.0009} = 56 \text{ clusters}.$$

Using the equation for the variance of a simple random sample, we find
the number of elements necessary for the same variance:
n = .234/.0009 = 260 elements.
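
For the fixed-precision question, the required number of sampling units is the unit variance divided by the desired variance of the mean. The sketch below uses exact fractions to avoid rounding trouble at the boundary; the function name `units_needed` is hypothetical, and the unit variances .0504 and .234 are those estimated earlier.

```python
import math
from fractions import Fraction

def units_needed(unit_variance, standard_error):
    """Smallest number of independent sampling units whose mean has the
    required standard error (finite population correction neglected).

    Both arguments are decimal strings so that the division is exact.
    """
    target_variance = Fraction(standard_error) ** 2
    return math.ceil(Fraction(unit_variance) / target_variance)

clusters_needed = units_needed("0.0504", "0.03")   # clusters of 10 subscribers
elements_needed = units_needed("0.234", "0.03")    # subscribers taken singly

interviews_in_cluster_design = clusters_needed * 10
print(clusters_needed, interviews_in_cluster_design, elements_needed)
```

The cluster design thus calls for more than twice as many interviews; whether it is nevertheless the cheaper depends on the cost of an interview under each design.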
the comparison of any number of alternative designs. For example, in
the case above we might investigate whether a cluster of 3 or 5 or 20
elements might be superior to both single elements and to clusters of 10.

If the cost per element were the same for the clusters of 10 elements
as for the individual elements, the latter would usually show up better.
It will generally yield the same variance for smaller n or the smaller
variance for the same n. However, the cost per element is generally
not the same for elements in clusters as for elements selected
individually. Hence, the comparison should be made between variances
per unit cost.

As these comparisons are made, it becomes obvious that the
number of elements (say, interviews) in the sample is not the sole
important factor as regards either the cost or the variance of sample
designs. The numbers of all other sampling units selected (such as
dwellings, blocks, towns, cities, counties) are also relevant.
Furthermore, other aspects of design (stratification, varying probabilities
of selection, and methods of estimation) should be considered also.
However, in most social studies those aspects are not so important as
the effect of clustering, either for the variance or for the costs of the
study.

Three important questions for which we do not have definite
answers are pertinent:

(a) Where may we get the needed estimates of variances and of
cost factors? We can prepare them on the basis of past experience with
similar surveys, ask an expert, or conduct a pilot study. Good estimates
of cost factors are especially difficult to get.

(b) How do we find the precision needed? The user of the statistic
should determine this. Usually this factor is a much greater source of
uncertainty than those in (a). Theoretically, one might say that a
sample is too large if the statistics it yields are more precise than is
warranted by the uses of those statistics; that a sample is too small if its
statistics are not precise enough to aid in making decisions based on
them to an extent commensurate with its cost. According to this
rational picture, the desired precision would always determine the
sample size, and the necessary cost would be allocated in accord with
that determination. Actually, we seldom find the necessary precision so
well defined, or the available funds so fluid.

(c) For studies with several or many important characteristics to
measure, how do we evaluate the relative importance of each? How
do we arbitrate between the conflicting answers on the desirable size
and design of the sample? This area of decisions is even more obscure
than the preceding one.

Despite these difficulties, theory does provide general outlines for 
the decision on the required sample size and it can be of great help in 
choosing an efficient design. Fortunately, moderate departures from 
optimum design do not incur heavy losses in economy. Hence, rough 
guesses may serve in place of better estimates, and useful compromises 
between conflicting aims in design can often be made. 

The designing of samples is mainly an engineering job; the
available theory and the knowledge of results with similar materials should
be utilized to produce a desired result with the available resources
and with the greatest economy. This, at any rate, should be the goal,
but the use of the superlative in the preceding sentence is immodest; it
usually represents a level of aspiration rather than of achievement. In
making economy the aim, we should understand that cost is to be
understood broadly to mean effort in general. Since effort as well as
money available for research is limited, economy helps to increase the
total quantity and quality of research output (1, p. 50ff).

15. Practicality

A probability sample cannot be created by assumption, nor will it 
be “given,” as in the examples of elementary statistics. The dictum of 
"quota” samplers to their interviewers, “Go out and get a random 
sample,” is most impractical. The interviewer is not capable of doing it, 
nor is his dispatcher. The need for a mechanical method of selection 
has been stated in Sections 4 and 5. Now we want to emphasize the 
need for taking care to translate the theoretical model of selection into 
a complete set of simple, practical instructions. 

It is necessary to give the field interviewer simple and clear 
instructions for carrying out his tasks. The less attention the 
sampling instructions demand of him, the more he can devote to his 
principal and difficult task of interviewing. For example, in order to 
identify a sample segment the interviewer should not be asked to locate 
a long arbitrary straight line marked on a map— but he can locate a 
street. His sampling instructions should be confined to locating streets 
and addresses, listing occupants of the household, and so on. These 
tasks are difficult enough in some cases. 

Of course, “clear, simple, practical and complete instructions” 
represent aims rather than fulfillment. Actually, in the fitting of 
practical field sampling procedures of large-scale samples to the 
statistical model of the design, there will often be some unfilled gaps 
(see Section 16). Part of the art of practical work is in guessing what 
irregularities, where and how much, one can afford to tolerate. 

The sample design is no better than the weakest link in the entire 
procedure. Each sample design is an adaptation of sampling theory to 
the resources at hand. The resources include the distribution of the 
population, the facilities for communication, the nature and training 
of the field force, and the researchers engaged in the task. They also 
include the receptivity of the administration as well as of the users of 
research— their receptivity to, and understanding of, the methodo- 
logical tools. 

16. Nonsampling Errors 

In this chapter the verbal statements of the confidence interval 
refer to “the population value.” This is defined as the value that would 
have been obtained if the entire population, rather than just a sample, 
had been designated for observation. This definition deliberately 
avoids the “true value,” the parameter, because there are sources of 
error which exist even if every element is designated for observation. 
These errors are called nonsampling errors, to distinguish them from 
the sampling errors which arise because only a part of the total popula- 
tion is designated for observation. The nonsampling errors are some- 
times called errors of observation, or errors of measurement, or errors 
of response (3, pp. 15-52; 22, pp. 9-16). They occur because observa- 
tions have to be made to obtain some needed result and because the 
physical procedures of observation are subject to imperfections. The 
sampling errors occur when the observations are made on only a frac- 
tion of the population. 

Nonsampling errors may be of two types: variable-response errors and 
biases. Within the variable-response error are included all those errors 
in the procedures of observation (interviewing, coding, punching, 
nonresponse, etc.) which tend to cancel each other in the long run. 
In contrast, we include under the term bias all the systematic, 
noncanceling discrepancies between our observations and the quantities 
we aim to measure. The size of this bias is unknown in 
practice. Thus the population value obtained in a census by means of a 
single observation on each individual is subject to error. Because of 
the theoretical fluctuation due to the variable-response error, we may 
think of a distribution of possible population values. The mean of this 
theoretical distribution of population values is the “expected popula- 
tion value.” The difference between this value and the “true value” of 
the parameter is the bias (1, pp. 292-317; 11, pp. 147-154). 

In addition to the nonsampling errors of bias and of variable 
response, sample studies are also subject to sampling errors. In the 
general study design, all of these errors may be considered together as 
constituting the total error of the sample. This total error is the square 
root of the sum of the squares of two quantities. The first is the standard 
error of the sampling distribution, the sampling error. The second 
quantity is the combined effect of the two kinds of nonsampling errors 
which we called variable-response error and bias. That is, 

$$(\text{total error})^2 = (\text{standard error of sampling})^2 + (\text{nonsampling errors})^2.$$

This relationship may be illustrated by means of the three sides of a right 
triangle: 



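[Figure: right triangle with the standard error of sampling and the nonsampling errors as legs and the total error as hypotenuse.] 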


The total error depends on the length of both of the legs and cannot be 
shorter than either of them. The standard error leg can be shortened 
sometimes by a change in sample design, and always by taking more 
sampling units (either more clusters or more individuals). Of the 
nonsampling errors, the variable-response errors may be reduced either by 
taking more of something (observations per individual, or individuals, 
or interviewers) or by improving the precision of the methods of 
observation. But the length of the nonsampling leg may be due mostly 
to bias, which can be reduced only through better survey procedures: 
through improving the questionnaire, or the field work, or the coding 
and processing, etc. 
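
The right-triangle relation can be checked numerically. The following Python sketch is an illustration only: the function name `total_error` and the figures 0.03 and 0.04 are assumptions chosen for the example, not values from the text.

```python
import math


def total_error(sampling_se: float, nonsampling_error: float) -> float:
    """Hypotenuse of the right triangle whose legs are the two error components."""
    return math.hypot(sampling_se, nonsampling_error)


# Hypothetical figures: standard error of sampling 0.03, nonsampling error 0.04.
base = total_error(0.03, 0.04)

# Quadrupling a simple random sample halves the standard error of sampling,
# but the nonsampling leg is left untouched.
bigger_sample = total_error(0.015, 0.04)

print(f"total error, original sample:   {base:.4f}")
print(f"total error, quadrupled sample: {bigger_sample:.4f}")
```

With these assumed figures the fourfold increase in sample size lowers the total error only from 0.05 to about 0.043, because the total can never be shorter than the nonsampling leg of 0.04.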

It may be wasteful to spend much money on a large sample in 
order to reduce the standard error if the nonsampling errors are 
allowed to remain large, and vice versa. In the general study design, 
the nonsampling errors should be considered together with the sampling 
error, because together they constitute the total error of the survey. 
An important special class of nonsampling errors in social studies is 
composed of the errors of nonresponse. These arise whenever a member 
of the population designated for the sample is not included in the 
results, whether because his answer was “not ascertained,” because of total 
refusal of the interview, because of not being at home, because of 
illness, or for similar reasons. This is a nonsampling error; it can occur 
even if the entire population is designated initially for the study. In 

[...] (pp. 92-304). The reduction of nonresponse is an important 
[...] and the structure of the [...] in the other. What about [...] the 


VARIOUS STATISTICS 


17. About Various Statistics 

The discussions in this chapter were built around problems of 
estimating the mean of a population. The restriction was for the sake 
of simplicity and convenience. The choice of the mean rests on its 
basic importance. The estimation of totals for the population is closely 
related to that of means (as discussed in Section 1). Furthermore, 
proportions are but a special case of the mean, and a great deal of 
social research is reported in terms of proportions, of frequency 
distributions in classes. The distributions may be of attributes, behaviors, 
attitudes, or opinions; they very often are expressed as proportions of 
the total. 

In addition to the mean for the entire population total, the estima- 
tion of means for subdivisions of the total population is often of great 
importance. These have been called “domains”: “Any subdivision 
about which the enquiry is planned to supply numerical information 
of known precision may be termed a domain of study” (21, p. 5). In 
general, the principles discussed in this chapter in terms of means of 
the total population apply as well to the means of the domains. Hence, 
the results based on cross-tabulations present no new problems in 
principle. However, it is true that, compared with the total population, 
in dealing with a domain one gets more often into “cells” so small 
that the problems of small samples, particularly the questions of non- 
normality, become important. Hence, the reference to available tests 
for “distribution-free” estimates is particularly relevant here (see Chap. 
12). Furthermore, the effect of clustering becomes less drastic for 
many of the domains than for the entire sample because of the smaller 
number of elements per cluster. 

Sometimes a researcher will say that the question of sampling is of 
no interest to him because he wants not to estimate quantities but 
merely to measure relations. This view may overlook the fact that the 
relationships are measured in terms of statistics: in comparisons of 
proportions, in correlation coefficients, etc. These statistics, too, depend 
on the individuals included in the sample. When some relationship is 
expressed in terms of a number based on sample data, that number is a 
statistic, a sample estimate of a population characteristic. The statistic 
is subject to sampling error, and the sampling error can be expressed in 
terms of a confidence interval. 

“Tests of significance” of sample results are basic in research. The 
statistical tools required for many tests of significance are the same as 
those which are necessary for the construction of the analogous con- 
fidence intervals.* Suppose, for example, that it is desired to make a 
test of significance between two means, each pertaining to a domain of 
the study. The statistic used in the test is a ratio; the numerator is the 
difference of the two means, and the denominator is the standard 
error of that difference. The variance of the difference for the case 
of two independent simple random samples is simply the sum of the 
variances of the two sample means. For more complex designs, another 
term has to be considered— the covariance of the two sample means 
brought about by the design. 
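
The ratio described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function name `difference_test_ratio` and all numerical inputs are hypothetical, and the covariance argument simply applies the standard identity var(a - b) = var(a) + var(b) - 2 cov(a, b).

```python
import math


def difference_test_ratio(mean1, var_mean1, mean2, var_mean2, cov_means=0.0):
    """Difference of two sample means divided by the standard error of the difference.

    For two independent simple random samples the covariance is zero and the
    variance of the difference is the sum of the two variances. For a more
    complex design, pass the covariance of the two means induced by the design.
    """
    var_diff = var_mean1 + var_mean2 - 2.0 * cov_means
    if var_diff <= 0:
        raise ValueError("variance of the difference must be positive")
    return (mean1 - mean2) / math.sqrt(var_diff)


# Hypothetical domain means and variances of those means.
independent = difference_test_ratio(0.45, 0.0009, 0.35, 0.0016)

# Same means, but with an assumed positive covariance from a shared design.
correlated = difference_test_ratio(0.45, 0.0009, 0.35, 0.0016, cov_means=0.0004)

print(round(independent, 3), round(correlated, 3))
```

In this made-up case a positive covariance shrinks the variance of the difference, so ignoring it would understate the ratio.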

A frequently used test of the presence of relationship is the 
chi-square test based on the 2 × 2 cells of two dichotomies. In the case of 
simple random samples, that test is very similar to the test of the 
difference of two proportions (20, p. 203). In case of a more complex 
design, the chi-square test is not valid because its underlying 
assumption of the independent selection of sample cases is violated by the 
sample design. But the test of the difference of the two proportions 
may be made. This requires that the correlations due to clusterings 
and to the other complexities of the design be considered. 
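
For simple random samples the closeness of the two tests can be seen directly: the Pearson chi-square for a 2 × 2 table (without continuity correction) equals the square of the pooled two-proportion ratio. The Python sketch below uses a hypothetical table and hypothetical function names; it does not handle the complex-design case, where the clustering correlations would have to enter the standard error.

```python
def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square, no continuity correction, for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def two_proportion_z(a, b, c, d):
    """Pooled ratio for the difference between the proportions a/(a+b) and c/(c+d)."""
    n1, n2 = a + b, c + d
    p1, p2 = a / n1, c / n2
    pooled = (a + c) / (n1 + n2)
    standard_error = (pooled * (1 - pooled) * (1 / n1 + 1 / n2)) ** 0.5
    return (p1 - p2) / standard_error


# Hypothetical counts: 30 of 100 in the first group, 45 of 100 in the second.
chi2 = pearson_chi_square_2x2(30, 70, 45, 55)
z = two_proportion_z(30, 70, 45, 55)

print(round(chi2, 4), round(z * z, 4))
```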


COMPLEX SAMPLE DESIGNS 


18. Stratified Cluster Sampling 

In Section 12 we dealt with a sample of subscribers as an example 
of cluster sampling. The estimated variance of the sample mean was 
given as $(1 - f)\,s^2/m$. That formula is applicable to the case where 
the sample of m out of M clusters represents m independent random 
selections. It is basic to the more complex designs. However, the 
simple random choice of clusters is not used frequently in practice. 
Stratification is used generally. Very frequently the selection is 
systematic, as in our example of the clusters of subscribers. If the listing 
of the clusters had been random, as after a thorough shuffling, the 
systematic selection would be equivalent to a random choice. It was 
stated specifically, however, that neighboring routes were filed next to 
one another. Therefore, the similarities, slight or great, that might 
exist in neighboring routes would be reflected in a selection of clusters 
rather evenly spread over the different neighborhoods of the area. We 
shall assume that the chief effect of this systematic selection of the 
clusters is to yield a stratified sample of them (see Section 10). 

* Because of the logical basis of the null hypothesis which underlies tests of 
significance, the correction for finite population (1 - f) should not be included in 
the variance formulas when these are used in tests of significance (3, p. 247). 

Each selection came from its own “implicit stratum” of 30 clusters 
(the interval was 1 in 30). However, we shall use an approximation 
called the method of “collapsed strata” and assume that each successive 
set of four selections was selected by random choice from a group of 
120 clusters. Thus, we shall have 10 strata with the equal weight of 
1/10 for each. There are four equal-sized clusters selected from each 
stratum. The sample mean is calculated as before, but the sample 
design now assumed calls for a different formula for obtaining the 
estimated variance of the sample mean. We need here a specific 
application of the general formula for stratified samples (Section 8): 

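$$\text{est. var}(\bar{x}_w) = \sum_h W_h^2\,[\text{est. var}(\bar{x}_h)].$$

The quantity within the brackets is the variance within one of the 
strata. Within each stratum we have a sample of four equal clusters 
out of 120. Hence (as in Section 12) we have in the hth stratum 

$$\text{est. var}(\bar{x}_h) = (1 - f)\,\frac{1}{4}\cdot\frac{1}{3}\sum_{i=1}^{4}(\bar{x}_{hi} - \bar{x}_h)^2.$$

Then, for the entire sample, we have 

$$\text{est. var}(\bar{x}_w) = \sum_{h=1}^{10}\left(\frac{1}{10}\right)^2 (1 - f)\,\frac{1}{4}\cdot\frac{1}{3}\sum_{i=1}^{4}(\bar{x}_{hi} - \bar{x}_h)^2 = \frac{1 - f}{1200}\sum_{h=1}^{10}\sum_{i=1}^{4}(\bar{x}_{hi} - \bar{x}_h)^2.$$

In the above, $\sum_{h=1}^{10}\sum_{i=1}^{4}(\bar{x}_{hi} - \bar{x}_h)^2$ is the term we need to calculate.† 

There are ten terms, of which the first is: 

$$\sum_{i=1}^{4}(\bar{x}_{1i} - \bar{x}_1)^2 = (.4 - .325)^2 + (.7 - .325)^2 + (.1 - .325)^2 + (.1 - .325)^2 = .2475,$$

where $\bar{x}_1 = \tfrac{1}{4}(.4 + .7 + .1 + .1) = \tfrac{1}{4}(1.3) = .325$. 

The sum of the ten terms (of which the first is .2475) equals 1.505. Hence: 

$$\text{est. var}(\bar{x}_w) = .967 \times \frac{1}{1200} \times 1.505 = .00121.$$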


In our case this approximation did not produce a variance smaller 
than the .001217 obtained without considering any strata in Section 12. 

We might have used 20 strata of two clusters each. Then the 
variance would come from the 20 differences between pairs of clusters. 
A formula used in systematic sampling, and mentioned in Section 10, 
utilizes all of the 39 successive differences. That is 

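$$\text{est. var}(\bar{x}) = (1 - f)\,\frac{1}{m}\cdot\frac{1}{2(m - 1)}\sum_{i=1}^{m-1}(\bar{x}_i - \bar{x}_{i+1})^2 = \frac{29}{30}\cdot\frac{1}{40}\cdot\frac{1}{2 \times 39}\cdot\frac{1}{100}\cdot 344 = .00107.$$

Here, with $\bar{x}_i = X_i/10$, $\sum_{i=1}^{39}(X_i - X_{i+1})^2 = (4 - 7)^2 + (7 - 1)^2 + (1 - 1)^2 + \dots + (5 - 2)^2 + (2 - 4)^2 = 344$. 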

The three separate calculations of the variance of this sample did not 
yield appreciably different results. 
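
The three calculations can be compared side by side in code. The Python sketch below is an illustration under stated assumptions: the function names are invented here, only the first four cluster means (.4, .7, .1, .1) come from the text, and the remaining four means are hypothetical, since the full set of 40 values is not reproduced in this section.

```python
from statistics import mean


def var_random_clusters(xbars, f):
    """(1 - f) s^2 / m, treating the m cluster means as independent random selections."""
    m = len(xbars)
    center = mean(xbars)
    s2 = sum((x - center) ** 2 for x in xbars) / (m - 1)
    return (1 - f) * s2 / m


def var_collapsed_strata(xbars, f, per_stratum):
    """Equal-weight strata formed from successive groups of `per_stratum` clusters."""
    if len(xbars) % per_stratum:
        raise ValueError("clusters must divide evenly into strata")
    n_strata = len(xbars) // per_stratum
    within = 0.0
    for h in range(n_strata):
        group = xbars[h * per_stratum:(h + 1) * per_stratum]
        center = mean(group)
        within += sum((x - center) ** 2 for x in group)
    k = per_stratum
    return (1 - f) * within / (n_strata ** 2 * k * (k - 1))


def var_successive_differences(xbars, f):
    """(1 - f) / (2 m (m - 1)) times the sum of squared successive differences."""
    m = len(xbars)
    d2 = sum((xbars[i] - xbars[i + 1]) ** 2 for i in range(m - 1))
    return (1 - f) * d2 / (2 * m * (m - 1))


f = 1 / 30
first_stratum = [0.4, 0.7, 0.1, 0.1]            # cluster means quoted in the text
cluster_means = first_stratum + [0.3, 0.2, 0.5, 0.4]  # last four are hypothetical

estimates = {
    "random clusters": var_random_clusters(cluster_means, f),
    "collapsed strata of 4": var_collapsed_strata(cluster_means, f, 4),
    "successive differences": var_successive_differences(cluster_means, f),
}
for name, value in estimates.items():
    print(f"{name}: {value:.5f}")
```

With a single stratum containing all clusters, the collapsed-strata function reduces to the random-cluster formula, which is a convenient check on the arithmetic.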


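† It is easier to calculate the equivalent term: 

$$\sum_i \bar{x}_{1i}^2 - \frac{1}{4}\left(\sum_i \bar{x}_{1i}\right)^2 = \frac{1}{100}\left[(16 + 49 + 1 + 1) - \frac{1}{4}(13)^2\right] = .2475.$$

The sum of the ten terms is 

$$\sum_{h=1}^{10}\left[\sum_i \bar{x}_{hi}^2 - \frac{1}{4}\left(\sum_i \bar{x}_{hi}\right)^2\right] = \frac{1}{100}\sum_{h=1}^{10}\left[\sum_i X_{hi}^2 - \frac{1}{4}\left(\sum_i X_{hi}\right)^2\right] = 1.505.$$

In Sections 19, 20, and 21, the selection of subsamples is discussed. 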
The examples discussed in each of these sections deal with samples to 
which the calculation of the simple mean $\sum x/n$ is proper. The 
calculation of variances for subsamples in general will not be discussed in this 
chapter. However, if the subsamples in all the clusters are of equal 
size (or very nearly so), the formulas for the variances of the mean are 
similar to those given just above. Equal-size subsamples are obtained 
with systematic selection with probabilities proportional to size. An 
example from Section 20 is the subsample of 10 employees each from a 
sample of 40 work groups. A satisfactory approximation might be: 

$$\text{est. var}(\bar{x}'_{es}) = \frac{1 - f}{2m(m - 1)}\sum_{i=1}^{m-1}(\bar{x}'_i - \bar{x}'_{i+1})^2 = \frac{1 - f}{2 \times 40 \times 39}\sum_{i=1}^{39}(\bar{x}'_i - \bar{x}'_{i+1})^2.$$

The use of $\bar{x}'_i$ rather than $\bar{x}_i$ denotes the fact that, since we have only a 
sample of the employees of any section, we deal not with the exact 
value of $\bar{x}_i$ but only with its estimate $\bar{x}'_i$. The subscript denotes a 
sample composed of subsamples of equal size. 

No formulas for the variance will be given in this chapter for 
stratified samples based on clusters or on cluster subsamples which are 
of different sizes. The possibilities for varieties of procedures both of 
selection and of estimation are very great. Differences which may 
appear minor to the unpracticed eye may have important effects 
on the design. It is wise to consult a sampling statistician with the 
design during the planning stage (1, pp. 234-267). 


19. Subsampling. A Two-stage Sample 

Suppose that a sample is to be taken of a city’s dwellings and that 
two designs are compared, with the aim of making a choice between 
them. One calls for 400 dwellings to be selected individually from the 
listing of the city’s total of 12,000; the second calls for a cluster sample 
of 25 of the city’s 750 blocks (as described in Section 12). In each 
case we have a sampling fraction of 1 in 30, a probability of 1 in 30 
for every dwelling in the city. We are comparing two samples with the 
same expected number of elements. The objection to the sample of 
individual dwellings is that its cost per element is high. There are an 
average of 16 dwellings per block; hence it takes on the average about 
two blocks to produce a sample dwelling. The travel between sample 
dwellings may not be serious within this moderate-sized city, although 
in other cases of clustering the cost of travel may be of the greatest 
importance (see Section 21). However, no satisfactory listing of the 
dwellings exists. Thus 29 nonsample dwellings would need to be listed 
for every sample dwelling and this additional burden increases the 
cost per element appreciably. 

The chief objection to the cluster design is that the variance per 
element is high. An average of 16 dwellings per block means that the 
ratio of the variance of this sample to the first design is [1 + rho 
(16 - 1)] (see Section 13). Hence, for example, if for a characteristic 
rho is 0.1, then the variance of the cluster design will be 2.5 times 
as great as that of the first design. The trouble is that with the cluster 
design the sample is confined to only 25 blocks in the city. The spread 
of the sample is very restricted. 

This is the general rule: with the increase in the size of the cluster 
the cost per element decreases but the variances per element of the 
sample estimates increase (see Section 14). We may then look for a 
compromise between these two factors: on the one hand, we wish to 
spread our sample into as many clusters as possible in order to include 
in the sample the diverse elements in the population; on the other, 
we wish to keep down the costs incurred by covering many clusters. 
In other words, we wish to look for that cluster size which yields a 
good compromise between the two conflicting effects of clustering. 
That is, after comparing* the economy of the two designs, we look at 
other designs, in this case for some compromise in the form of smaller 
clusters. However, the sizes of clusters are often such that it is 
inconvenient to split them. For instance, most city blocks have streets for 
boundaries, so that their areas can be located with relatively simple 
instructions, but a fraction of a block is hard to define.† 

* Comparison is in the manner discussed in Section 14. The present discussion 
is in terms of a constant number of elements (m), but it may be translated into the 
more rational grounds of a fixed allowable variance or cost. 

† In Section 21 another reason for subsampling appears, an increase in the 
effective spread of the sample without incurring a proportionate increase in costs 
(3, p. 189). 

Suppose that instead of the sample of entire blocks we investigate 
a third design that would triple the number of clusters in the sample: 
it would take 3 × 25 = 75 blocks, and an average of 16/3 = 5.3 
dwellings in each. The intraclass correlation does not change, since 
it is a property of the actual block, but its effect becomes less than one 
third as great as before. In a calculation of the effect of clustering, the 
size of the average subsample taken from the cluster appears as the 
multiplier: for the case of rho = 0.1, we have [1 + 0.1 (5.3 - 1)] = 
1.43 as the ratio of increase of variance over the first design. But this 
is considerably less than the ratio of 2.5 of the second design over the 
first. On the other hand, the cost of the third design is slightly greater 
than that of the second (75 blocks in contrast with 25) but perhaps 
in this case considerably less than that of the first. 
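
The two ratios quoted above follow from the same expression, [1 + rho (average take per cluster - 1)]. A minimal Python sketch, in which the function name `variance_ratio` is an assumption made for illustration:

```python
def variance_ratio(rho: float, avg_take_per_cluster: float) -> float:
    """Ratio of the clustered design's variance to that of individually selected elements."""
    return 1.0 + rho * (avg_take_per_cluster - 1.0)


rho = 0.1
second_design = variance_ratio(rho, 16)      # 25 whole blocks of 16 dwellings
third_design = variance_ratio(rho, 16 / 3)   # 75 blocks, about 5.3 dwellings each

# Share of the clustering effect that remains after subsampling.
remaining_effect = (third_design - 1.0) / (second_design - 1.0)

print(round(second_design, 2), round(third_design, 2), round(remaining_effect, 3))
```

The last quantity compares only the excess over 1, which is the sense in which the effect of the unchanged intraclass correlation becomes less than one third as great.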

Suppose that after examination of a few other designs (say, 
clusters of 2 and clusters of 8), the third design, of clusters of 5.3 
dwellings, is adopted. The plan is then to select 1/10 of the blocks 
and to select 1/3 of the dwellings within the selected blocks. The 
probability of selection of any dwelling unit is 1/10 × 1/3 = 1/30 as 
before. Note that the inclusion in the sample of any dwelling depends 
on its being selected in two separate events. There are two stages of 
selection: first a sample of blocks is selected; then from the list of 
dwellings found within the sample blocks a sample of dwellings is 
subselected. This is an example of multistage sampling in general (see 
Section 21) (1, pp. 215-267; 3, pp. 135ff and 372-397). 

At each stage of sampling there must exist a listing procedure for 
all the sampling units among which the selection will be made (Sec- 
tion 5). The listing of the blocks of the city is accomplished by the 
division of the map into blocks and the numbering of these blocks. 
After the sample blocks are selected, a listing must be made only of 
the dwellings in the sample blocks (15). The procedures of selection 
in the two stages are determined separately, and each of them may 
be a simple random selection or stratified, systematic, etc. In the 
selections of blocks and dwellings, as in many other undertakings, the 
use of systematic samples is quite usual. In our example the blocks 
would be selected with the interval 10 after a start with a random 
number from 1 to 10. In each sample block an interval of 3 would be 
applied to the listed dwellings after a random start from 1 to 3. Then 
we would obtain in each sample block a number of sample dwellings 
which is closely proportional to the total number of dwellings listed 
in the entire block in the proportion of 1 in 3. However, there would 
be divergences of 1/3 and 2/3 of a dwelling from the exact 
fractional ratio. 
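
The two systematic selections just described can be sketched as follows. This Python example is illustrative only: the helper names `systematic` and `two_stage_sample` are invented here, and the city of 750 blocks with exactly 16 dwellings each is a hypothetical simplification of the example in the text.

```python
import random


def systematic(items, interval, start):
    """Every `interval`-th item of a listing, beginning with the `start`-th (1-based)."""
    if not 1 <= start <= interval:
        raise ValueError("start must lie between 1 and the interval")
    return items[start - 1::interval]


def two_stage_sample(blocks, rng, block_interval=10, dwelling_interval=3):
    """Select blocks systematically, then dwellings systematically within each block."""
    chosen_blocks = systematic(blocks, block_interval, rng.randint(1, block_interval))
    sample = []
    for dwellings in chosen_blocks:
        start = rng.randint(1, dwelling_interval)  # fresh random start in each block
        sample.extend(systematic(dwellings, dwelling_interval, start))
    return sample


# Hypothetical city: 750 blocks of 16 dwellings, each dwelling labeled (block, number).
city = [[(b, d) for d in range(16)] for b in range(750)]
sample = two_stage_sample(city, random.Random(1))

print(len(sample))  # between 375 and 450, depending on the random starts
```

A block of 16 dwellings yields 6 sample dwellings for a start of 1 and 5 for a start of 2 or 3, which is the divergence of 2/3 or 1/3 of a dwelling from the exact ratio 16/3.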

The numbering of the blocks may be used to introduce 
stratification. All that is necessary is to have relatively similar blocks (by 
geography or other variables) numbered consecutively. The interval 
applied to that list will select one block from each group of blocks 
equal in number to the interval; in this illustration, one from every ten. 

The design has general applicability: similar considerations in 
other situations may lead to the same design of subsampling. For 
example, we might use an interval to select among all the work groups 
of a factory, subsampling with another interval for employees within 
the work groups. 


20. Subsampling with Probabilities Proportional to Size 

In Section 19, a sample design was described for a two-stage 
sample of a city with an interval of 10 for selecting blocks and an 
interval of 3 for selecting dwellings within the selected sample blocks. 
The probability for selection of any dwelling in the city was 1/10 × 1/3 
= 1/30. The procedure for selecting sample dwellings with the use of a 
constant interval within the blocks will yield in each sample block a 
number of sample dwellings which is closely proportional to the total 
number of dwellings existing in the block. But that number of dwellings 
will vary from block to block, sometimes a great deal. On the other 
hand, for reasons of efficiency and convenience, we would like to keep 
the number of sample dwellings more nearly constant. However, were 
we to do this by increasing the sampling interval 
within the larger blocks, we would thereby decrease the probability 
of selection of the dwellings within these larger blocks and thereby 
depart from the aim of an equal probability (of 1 in 30) for all the 
dwelling units in the city. 

We can remain true to that aim and also accomplish our purpose 
if we increase the probability of selection of any block in the same 
ratio as we decrease the probability of selection within that same 
block. Thus, in the process of numbering blocks, a block which appears 
to have twice the usual number of dwellings is given two 
consecutive block unit numbers, and its probability of selection is thereby 
doubled from 1/10 to 2/10. When one of these blocks falls into the 
sample, the interval of sampling within it is doubled; that is, every 
sixth address is selected after a random start between 1 and 6. Hence, 
the probability of selection of a dwelling in that block is 2/10 × 1/6 = 
1/30, as required (3, pp. 393-397). 

The over-all probability of selection of any dwelling in any block 
is the product of the two probabilities: the probability of selection of 
the block, and the probability of selection of the sample dwelling within 
the sample block. During the process of numbering the blocks we 
assign block unit numbers as “measures of size” ($Z_i$) in accord with 
the number of apparent dwelling units in the block, and this measure 
$Z_i$ is used in opposite and balancing ways in the two intervals 
pertaining to the two stages of selection. It is to the list of consecutive block 
unit numbers that the interval 10 is applied. The probabilities of 
selection are now represented for the two stages as: 

$$\frac{Z_i}{10} \times \frac{1}{3Z_i} = \frac{1}{30}.$$

Here 1/30 stands for the over-all probability of selection of any dwelling 
in any block in the city. The probability of selection within a specific 
(the ith) block is represented by $1/(3Z_i)$, that is, by an interval of $3Z_i$. 
The measure of size $Z_i$ varies from block to block; for most blocks it is 
one, for some it is two, and for others three or more. 

The probability of selection is determined by the application of 
the sampling intervals and is independent of any inaccuracies and 
deficiencies in the measure of size ($Z_i$). In so far as the measures of 
size are made proportional to the total number of dwellings in the 
block, the procedure yields equal numbers of subsampled dwellings 
from blocks which vary in total size. In practice this process cannot 
be complete, and it need not be. The subsample will vary somewhat, 
but it is enough if most of the variation can thus be eliminated (15). 

Where does one obtain the information he needs on number of 
dwelling units in each of the blocks of the city so that he may assign 
measures of size? There are several sources of materials, one or more of 
which may be available for the city. Some of these are census block 
statistics, aerial photographs, the city engineer or the city planning 
office, the real estate board or the chamber of commerce, the local 
bank, the local newspaper, some local public utility, etc. Other 
sampling materials, as well as advice, may in some cases be obtained 
from the Census Bureau. If all these fail, one may resort to some cheap 
rough estimates (perhaps just a glance from a moving car) to be made 
for all the blocks of the city or for a large sample of them. 

The use of block units ($Z_i$) is aimed at equalizing roughly the 
number of dwellings subsampled from blocks which contain unequal 
numbers of dwelling units. This procedure may be looked upon as an 
example of selection with “probabilities proportional to a measure of 
size.” Let $P_i$ denote the estimate of the number of dwelling units in 
the block. And let us say that we want to obtain an average of 5 of the 
(estimated) dwellings per block. Then we can take for our two stages 
of selection the probabilities of $P_i/150 \times 5/P_i = 1/30$. A procedure 
for selecting the blocks may be the following: (a) Assign measures of 
size $P_i$ to each of the blocks. (b) Arrange the blocks in some desirable 
order; stratification may be obtained from the systematic selection 
because a block will be chosen from each successive sum of 150 of the 
$P_i$. (c) After a random start from 1 to 150, apply the interval of 150 
to obtain the selection numbers. (d) Cumulate the measures of size $P_i$, 
taking them in the prearranged order. This cumulation can be done 
easily on an adding machine: whenever the addition of a block causes 
the sum to reach or pass one of the selection numbers, that block is 
selected into the sample. Each block thus has a $P_i/150$ chance of 
being chosen. 
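
Steps (a) through (d) amount to a running total checked against a list of selection numbers. The Python sketch below is one way to express that cumulation; the function names and the six block sizes are hypothetical, and whole-number measures of size are assumed.

```python
from fractions import Fraction


def pps_systematic(sizes, interval, start):
    """Indices of units hit by the selection numbers start, start + interval, ...

    A unit is selected each time the running total of `sizes` reaches or passes
    a selection number, so a unit larger than the interval can be hit more than once.
    """
    if not 1 <= start <= interval:
        raise ValueError("start must lie between 1 and the interval")
    hits = []
    cumulative = 0
    next_number = start
    for index, size in enumerate(sizes):
        cumulative += size
        while next_number <= cumulative:
            hits.append(index)
            next_number += interval
    return hits


def within_block_interval(size, target_take=5):
    """Interval P_i / 5 to apply to the dwellings listed in a selected block."""
    return Fraction(size, target_take)


# Hypothetical measures of size for six blocks, in the prearranged order.
sizes = [10, 150, 40, 300, 5, 95]
selected = pps_systematic(sizes, interval=150, start=1)

# Over-all probability for a dwelling: (P_i / 150) times (1 / within-block interval).
overall = [Fraction(p, 150) / within_block_interval(p) for p in sizes]

print(selected, overall[0])
```

With a start of 1 the selection numbers are 1, 151, 301, and 451, so the fourth block (size 300) is hit twice. Whatever the measure of size, the product of the two stage probabilities stays at 1/30.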

The procedure for selecting dwellings within the block involves 
the listing of all the dwellings in the block. Then the interval of $P_i/5$ 
is applied to the listing of the block, after a random start from 1 to $P_i/5$. 
Note that in order to avoid the inconveniences of an interval less than 
1, in assigning them we should take no number smaller than 5 for $P_i$. 
Some special measures may be advised for the smaller blocks: they 
may be put in a special stratum, or they may be attached to the larger 
blocks, or they may simply be assigned a larger number arbitrarily. 

On the other hand, blocks so large that $P_i$ is greater than 150 may be 
selected into the sample more than once. If a block is selected twice, 
then the interval for selecting dwellings within it should be halved. 

The use of the varying measures of size does not eliminate all variation in 
the subsamples obtained from the sample blocks. For one thing, the number 
of dwellings in the block will not be, in general, an exact multiple 
of the sampling interval. Therefore, the number of dwellings 
obtained may be different (by the fraction [...]) from 
that expected on the average. Another source of 
variation is more important: it may be summarized as arising from 
the measures of size not being exactly proportional to the actual 
number of dwelling units listed in each of the blocks. This divergence 
may be due to a variety of reasons: inaccuracy or obsolescence of the 
source of the data, inaccuracies in listing, differences of definition, or 
differences in the units of measurement. All these possible differences 
will affect the actual size of the subsamples from the sample blocks. 
However, none of these change the probability of selection of any 
dwelling in any block because that is kept constant (at 1 /30) by the 
product of the two intervals. 

This form of selection has general applicability. Let us suppose 
that we want to select a sample of employees of a factory made up of 
work groups which vary in size from 15 to 500 employees. Now we 
may want to satisfy two conditions: (a) To give every employee in the 
factory equal probability (say 1/30) of being selected. Thus, the 
ordinary sample mean will be a proper estimate of the population mean 
for the factory. (b) To obtain a constant size sample (say 10 employees) 
from each selected section. Thus, the analysis dealing with sections as 
units will be facilitated and sometimes made more efficient. We have 
again an equation denoting the probabilities of selection in the two 
stages that looks like this: 

$$\frac{P_i}{300} \times \frac{10}{P_i} = \frac{1}{30}.$$

$P_i$ is the number of employees in the section or the estimated number 
in so far as exact numbers are not available. In the equation, the 
number 1/30 satisfies requirement (a); the fraction $10/P_i$ satisfies 
requirement (b) either exactly or approximately; and $P_i/300$ for the 
selection of groups is necessary to make the equation hold. 

The procedure will begin with the listing of all the sections. The 
order of listing may be utilized to yield stratification in connection 
with the systematic selection which will follow. Note that a section 
will be selected into the sample for each 300 employees. Cumulate the 
numbers of employees from section to section ending with the total 
number (12,000) in the population. Select a random number from 1 to 
300 and apply the interval of 300. There will be 12,000/300 = 40 
selections made. But a section larger than 300 may fall into the sample 
more than once. If a section falls in twice, make a double selection 
from it. 

The sample mean is again the simple $\sum x/n$. Incidentally, if in a 
sample of this kind 10 out of the 30 sections had a certain 
characteristic, it would be wrong to estimate from that datum that 1/3 of all 
sections in the factory were similarly characterized. Rather, one should 
estimate that 1/3 of all employees worked in sections which had that 
characteristic. The estimated variance of the sample mean is discussed 
in Section 18. 
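
The distinction between a proportion of sections and a proportion of employees can be verified by enumerating every possible random start. In the Python sketch below the five sections, their sizes, and the helper name `pps_hits` are all hypothetical; the cumulation rule is the one described for selection with probabilities proportional to size.

```python
from fractions import Fraction


def pps_hits(sizes, interval, start):
    """Indices of units selected by cumulating sizes against start, start + interval, ..."""
    hits, cumulative, next_number = [], 0, start
    for index, size in enumerate(sizes):
        cumulative += size
        while next_number <= cumulative:
            hits.append(index)
            next_number += interval
    return hits


# Hypothetical factory: (employees, has_characteristic) for each section.
sections = [(500, True), (100, False), (100, False), (200, False), (300, True)]
sizes = [n for n, _ in sections]
interval = 300

flagged = total = 0
for start in range(1, interval + 1):        # every possible random start
    for index in pps_hits(sizes, interval, start):
        total += 1
        flagged += sections[index][1]       # True counts as 1

share_of_selections = Fraction(flagged, total)
share_of_sections = Fraction(sum(flag for _, flag in sections), len(sections))
share_of_employees = Fraction(sum(n for n, flag in sections if flag), sum(sizes))

print(share_of_selections, share_of_sections, share_of_employees)
```

Averaged over all starts, the share of selections that carry the characteristic matches the share of employees working in such sections, not the share of sections.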

Selection with probabilities proportional to a measure of size is a 
widely used procedure. With its use the size of the subsamples can be 
stabilized in cases where the cluster size varies, in so far as useful (but 
not necessarily exact) measures of size are available. The probability 
of selection through the two stages is kept constant by the use of 
sampling rates, and it is not affected by the nature of the measures 
of size or by their inaccuracies. 

This procedure is used in practice in two ways: first, intervals 
are used as shown in the two examples above; secondly, a single selec- 
tion may be made from each stratum, as in Section 21. 


21. A Multistage Area Sample 

In the following description of the procedures for selecting the 
basic sampling units used for the national cross-section samples of the 
Survey Research Center, some details have been omitted. It is pre- 
sented as an example of the selection of a large-scale multistage area 
sample. The set of procedures used in the several successive stages is 
basically the same. First, the population to be sampled is defined in 
terms of sampling units, and the sampling units are placed into strata 
and listed in each stratum. Then, the measures of size for all the units 
in the stratum are assigned and one unit is selected with probability 
proportional to its measure of size. Then, within each of the selected 
units there occurs a repetition of the processes, the defining, stratifying, 
and listing of units and the assignation of the measures of size and 
selection (7). 

The design called for a sample of the 40 million private dwelling 
units* of the United States. We shall discuss a basic design adequate 
for a sample of about 4000 dwellings, a sample in which every dwelling 
is to be given a 1/10,000 chance of being selected. But it is understood 
that the same materials can be, and are, used over and over to yield 
various samples at different rates. Area sampling is used: the selection 
of area segments determines the selection of the dwellings associated 
with them. 

*The sample of private dwellings excludes the residences of some people, 
chiefly hotels and institutions. They could be included by suitable 
procedures in a separate stratum. The 40 million figure came from the 
1940 data used for drawing this sample. 

The field work is done by part-time interviewers who cover, 
usually by automobile, an area within a reasonable radius of their 
homes. An interview “load” of about 50 sample dwellings per sample 
area (for one or two interviewers) was taken on the basis of considera- 
tions of the economy of clustering (Section 14). But 50 sample dwellings 
represent 500,000 dwellings in the population, enough to populate a 
state. So that the interviewer will not have to cover an area as large 
as a state, the sample area of each interviewer was to be restricted. 
The county was chosen as a basic unit providing an economic com- 
promise between the aim of reducing the cost per sample dwelling and 
also spreading the sample over the diverse groupings in the population. 

Most sample counties contain a city (usually the largest urban 
center in the county and near its center), several smaller incorporated 
cities and villages, some unincorporated congested areas mostly around 
the cities, and, surrounding all these, open country where farmers as 
well as many nonfarmers live. The central city is the most likely place 
to find a suitable interviewer. Usually half or more of the county’s 
population (also half or more of its sample dwellings) are in or adjacent 
to the central city. The interviewer travels to the others (the outlying 
cities, villages, and open country segments) on roads which tend to 
radiate from her city. With an automobile she can go to one or more 
outlying sample points and return home the same day. 

Thus, the county appears as a rather desirable sampling unit for 
the first stage of selection, as a “primary sampling unit.” There are 
exceptions to the idealized picture drawn above. Some counties are 
too large or otherwise difficult to cover as an entity. These may be split 
into smaller units. In other cases, it is possible to build up from two or 
more counties, or from parts of them, primary sampling units which 
will have the desirable features mentioned above. From the point of 
view of increasing the spread of the sample over diverse areas, it is 
best to combine into one primary sampling unit contrasting kinds of 
areas. But to reduce the cost per sample dwelling, the amount of travel 
within the primary sampling unit is to be kept down. 
The primary sampling unit, the p.s.u., is confined roughly to a 
county, but to represent about 500,000 dwellings it must, in most cases, 
represent much more than itself. Each sample p.s.u. is selected from 
a stratum of p.s.u.’s which altogether add up to about 500,000 dwell- 
ings. For the subsequent clerical work it was convenient to deal with 
sampling units in terms of numbers of persons rather than dwelling 
units. Hence we equate 500,000 dwellings to 1,800,000 persons, and 
the strata are built up approximately to that measure. 

Each of the twelve largest metropolitan areas has more than a 
million population. Each of these areas, composed of a single county 
or of several adjacent counties, was adjudged to provide a large enough 
sample for a single team of interviewers. Hence, each of these areas 
comprises a separate stratum consisting of a single p.s.u. There are 4S 
counties altogether in these twelve areas, and they contain about 30 
percent of the people of the U. S. 

The other 70 percent of the people dwell in more than 3000 
counties, and those counties were sorted into 54 strata of about 1,800,000 
persons in each. In each stratum were grouped counties which appeared 
as similar as possible on the basis of geography, size of central city, 
and several other characteristics (7, 13). One particular stratum con- 
tains only three p.s.u.’s, each with one of the large cities of the South- 
west. On the other hand, the county exhibited below is one of more 
than 100 p.s.u.’s that compose this stratum, largely from the rural 
parts of the Plains States. 

                     Measures of size      Cumulated measures
                     (1940 population)     of size

First p.s.u.               10,238                10,238
Second p.s.u.              12,688                22,926
Third p.s.u.                4,060                26,986
   . . .                    . . .               247,769
                            5,843               253,612
Selected p.s.u.            19,277               272,889
                           20,487               293,376
   . . .                    . . .             1,818,589
Last p.s.u.                13,115             1,831,704

Total                   1,831,704

The selection of the p.s.u. was made from a list of all the p.s.u.’s 
in the stratum, on which were cumulated the measures of size of those 
p.s.u.’s (their 1940 populations). A random number from 1 to 1,831,704 
was taken, since this was the total 1940 population of the p.s.u.’s in 
the stratum. The number chosen was one of the 19,277 numbers 
(between 253,613 and 272,889), any of which would have selected this 
particular p.s.u. into the sample. 
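
The draw against the cumulated measures of size can be illustrated with a short Python sketch. The cumulated figures are copied from the table of the stratum; the labels, the function name `select_unit`, and the lumping of the unlisted p.s.u.'s into two "elided" rows are assumptions made only for this example.

```python
from bisect import bisect_left

# Cumulated measures of size copied from the table; the two rows marked
# "elided" stand for the many p.s.u.'s the table does not list.
labels = ["first", "second", "third", "elided", "unnamed",
          "selected", "next", "elided", "last"]
cumulated = [10_238, 22_926, 26_986, 247_769, 253_612,
             272_889, 293_376, 1_818_589, 1_831_704]


def select_unit(cumulated, r):
    """Index of the first unit whose cumulated size is at least r."""
    return bisect_left(cumulated, r)


# Any random number from 253,613 through 272,889 picks the same p.s.u.
for r in (253_613, 272_889, 272_890):
    print(r, labels[select_unit(cumulated, r)])
```

Because exactly 19,277 of the 1,831,704 possible numbers lead to the selected p.s.u., its chance of selection is proportional to its measure of size.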

The aim was to assign to every dwelling in the United States, and 
hence also to every dwelling within each stratum, a probability of 
1/10,000 of being selected. From the stratum, a p.s.u. was selected and 
its probability of selection was 

$$\frac{19{,}277}{1{,}831{,}704} = \frac{1}{95.02}.$$

What should be the rate of selection of dwellings within the 
county? It should be 

$$\frac{1}{10{,}000} \div \frac{1}{95.02} = \frac{1}{105.24}$$

because thereby the probability of selection of any dwelling in the 
stratum through the two processes of selection is made 

$$\frac{1}{95.02} \times \frac{1}{105.24} = \frac{1}{10{,}000}.$$
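
The arithmetic behind the within-county rate can be checked with exact fractions. The sketch below is illustrative only; the helper name `within_rate` is an assumption, while the figures are those quoted in the text.

```python
from fractions import Fraction


def within_rate(overall, first_stage):
    """Rate to apply inside a selected unit so the product equals overall."""
    return overall / first_stage


overall = Fraction(1, 10_000)               # desired chance for any dwelling
first_stage = Fraction(19_277, 1_831_704)   # chance of selecting this p.s.u.
second_stage = within_rate(overall, first_stage)

print(round(float(1 / first_stage), 2))     # 95.02
print(round(float(1 / second_stage), 2))    # 105.24
```

Working with `Fraction` keeps the product of the two stages exactly equal to 1/10,000, whereas the rounded denominators 95.02 and 105.24 reproduce it only approximately.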

The requirements of probability could be met simply by listing all 
of the more than 5000 dwellings* in the p.s.u. and applying the interval 
of 105.24 to the list. However, that procedure was judged to be too 
expensive. Alternatively, one could divide the entire area of the p.s.u., 
or county in this case, into small areas, segments in the open country 
and blocks in the towns, and apply the interval of 105.24 to those areas. 
That would be a feasible procedure in this largely rural county and 
suitable for some surveys. For our surveys it was judged that the town 
blocks with about 6 to 12 dwellings were larger than we desired. 
Therefore, it appeared that some subsampling was to be done within the 
p.s.u. 

*We obtain these crude estimates of dwellings, as quick aids in planning 
the field work, using a factor of 3-1/2 or 4 persons per dwelling for the 
total of 19,277 people in the p.s.u. 

Maps, air photographs, and census data were procured to see 
what the county contained. It was decided that travel to a town was 
to be made for not less than 6 sample dwellings. This minimum for 
the sample yield in the towns (the secondary sampling units) is an 
economic consideration similar to the assigned yield of about 50 
dwellings for the primary sampling units (Section 14). It means that a 
secondary stratum in the county should contain no less than roughly 
6/50 = 12 percent of the county. Thus the county was divided into 
the three strata given below with the 1940 populations for measures 
of size. Each of the strata is to be sampled with a rate of 1/105.24. 


Composition of a Sample County 

Stratum   Description of stratum             Measure of size   Cumulated
number                                       (1940 pop.)       measure of size

I         Central city, only urban place          3920
II        All smaller incorporated places          872               872
                                                   584              1456
                                                   447              1903
                                                   443              2346
                                                   357              2703
                                                   224              2927
                                                   171              3098
                                                  3098
III       Open country (the remainder)          12,259

          Total for County                      19,277


In dividing the county into strata and the strata into sampling units, the 
principles of listing must be observed; any area must appear in one and 
only one sampling unit (Sections 5 and 6). The division must be made 
clearly and consistently before selection takes place. 

Stratum I consists of the one city. It was sampled in the two stages 
of blocks and dwellings, as described in Section 19. The product of the 
probabilities of selection for the two stages was 1/17.54 × 1/6 = 
1/105.24. Altogether, the probability of selection of a dwelling in that 
city depended on three selections: 

$$\frac{1}{95.02} \times \frac{1}{17.54} \times \frac{1}{6} = \frac{1}{10{,}000}.$$

In Stratum II it was desired to restrict the expected yield of about 8 
dwellings to one of the 7 villages which composed the stratum. The 
random draw from 1 to 3098 selected the fourth village with a probability 
of 443/3098 = 1/6.993. In that town the rate of selection had to be 
made 1/105.24 ÷ 1/6.993 = 1/15.05. This was done again in the two 
stages for blocks and dwellings as 1/5.02 × 1/3 = 1/15.05. Thus a 
dwelling in that town was selected in four stages (county, town, block, 
dwelling), and the product of the probabilities is: 1/95.02 × 1/6.993 × 
1/5.02 × 1/3 = 1/10,000. 

All of Stratum III, composed of the open country surrounding 
the other places, was divided into area segments.† There were 736 of 
these segments with an average of about 4 or 5 dwellings per segment. 
The rate of 1/105.24 is applied directly to those segments, and 
every dwelling in the selected segments is included in the sample. 
Thus, about 7 segments, with about 30 dwellings in all, are expected 
from the open country. The selection of a dwelling in the open country 
is done in only two stages: 1/95.02 X 1/105.24 = 1/10,000. 
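
As a rough check, the stage-by-stage rates quoted for the three strata can be multiplied out. The sketch below is only illustrative; the dictionary layout and the function name `overall_denominator` are assumptions. Because the published rates are rounded, the products come out close to, but not exactly, 10,000.

```python
from math import prod

# Denominators of the stage-by-stage rates quoted in the text.
stages = {
    "I (city: county, block, dwelling)": [95.02, 17.54, 6],
    "II (village: county, town, block, dwelling)": [95.02, 6.993, 5.02, 3],
    "III (open country: county, segment)": [95.02, 105.24],
}


def overall_denominator(denominators):
    """Multiply stage denominators; 1/result is the overall probability."""
    return prod(denominators)


for name, dens in stages.items():
    print(name, round(overall_denominator(dens)))

# Expected number of open-country segments falling into the sample.
expected_segments = 736 / 105.24
print(round(expected_segments, 1))
```

The last figure corresponds to the "about 7 segments" expected from the open country.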

The work performed in this county was similar in nature to work 
performed separately in each of 66 primary sampling units. The pro- 
cedures were suited to the circumstances found in the primary sampling 
unit. It should be noted also that the expected numbers of sample 
dwellings are not “quotas.” The selections are made with sampling 
rates, which are numbers controlled to yield sample dwellings with a 
fixed probability. In so far as the actual populations are different from 
the measures of size, there will be variations in the yield of the sample 
from different areas. 


†This division, as well as other useful materials, may be bought cheaply from 
the Census Bureau and is part of the Master Sample. 


22. Procedures of Estimation 

We have examined in this chapter the role that some practical 
methods of selection can play in improving the sample design. The 
sample result with which we dealt (except in Sections 11 and 17) was 
the simple mean of the sample Σx/n. But in all cases we understood 
that the sample mean is of interest only in so far as it serves as an 

Sample design was defined earlier as dealing jointly with pro- 
cedures of selection and of estimation. The precision of the estimate 
based on a sample can often be improved— sometimes considerably— 
through the use of available auxiliary information. The topic is too 
complex for this chapter, but it is important, and the attention of 
the reader is called to it (1, pp. 111-188; 22, pp. 145-182). 

For a simple illustration, let us say that the simple random sample 
of 400 out of 12,000 employees has been selected (as in Section 1). 
Suppose that we have the information on the proportions (N_h/N) of 
employees that belong in the several classes of some variable. Although 
we did not use that information for stratification during the selection 
process, after the data have been collected we want to use the informa- 
tion to improve the sample mean. The question may arise: why was 
that information not used in the first place for stratification during the 
selection so as to obtain a proportionate stratified sample from each of 
those classes? Well, perhaps the sampler did not think of it at the time 
of selection. Or he had so many strata on the basis of other variables 
that he just could not use it for stratification in the selection process. 
Or, perhaps the data were not available beforehand. 

Let us assume the last of these possible causes. Suppose that the 
union tells us that of all the 12,000 employees in the factory 7200 
belong to the union. But union membership is not indicated on the 
payroll cards used for the selection of the sample. Suppose, also, that 
from the union, or from the respondents, we can find out who of the 400 
in the sample belonged and who did not. It is necessary that the 
definition of belonging as ascertained for the sample cases be the same 
as that used to classify the population into classes. The pertinent data 
on the weights for the two groups are given below, together with the 
results on the number of “yes” answers to a question from a simple 
random sample of 400 employees. 

                                              Total    Union   Nonunion

Number in class h              N_h            12,000   7200    4800
Proportion in class h          W_h = N_h/N    1.00     .60     .40
Numbers in the sample          n_h            400      260     140
Numbers of “yes” answers       p_h n_h        80       26      54
Proportion of “yes” answers    p_h            .20      .100    .386

The proportion of union members in the population is W_1 = N_1/N = 
.60. With proportionate stratified selection, .60 × 400 = 240 of the 
sample of 400 would have been selected from these; actually 260 were. 
This is not a large deviation, but the chances of getting a larger 
deviation are only about 1 in 20.* 
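
The text does not say how the figure of about 1 in 20 was obtained. One plausible reading, offered here only as an assumption, is a normal approximation to the number of union members in a simple random sample drawn without replacement. Under that assumption the two-sided tail probability comes out at the same order of magnitude as the quoted figure.

```python
from math import erfc, sqrt

N, n = 12_000, 400        # population and sample sizes from the text
W = 0.60                  # population proportion of union members
observed = 260            # union members found in the sample

expected = n * W
# Standard deviation of the count under simple random sampling without
# replacement (binomial variance times the finite population correction).
sd = sqrt(n * W * (1 - W) * (N - n) / (N - 1))
z = abs(observed - expected) / sd
two_sided = erfc(z / sqrt(2))   # normal approximation, both tails

print(round(expected), round(sd, 1), round(z, 2), round(two_sided, 3))
```

An exact hypergeometric calculation, or a continuity correction, would shift the result slightly; the sketch is meant only to show the kind of reasoning involved.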

Now the ordinary mean p' = 80/400 is, in effect, one where the 
strata are weighted in proportion to the representation in the sample. 
That is, 

$$p' = \frac{260}{400} \times .100 + \frac{140}{400} \times .386 = .200.$$

However, if we use the information we possess about the correct 
weights of the strata, we have 

$$p_w = .60 \times .100 + .40 \times .386 = .214.$$

We see that the correction is not great in this case, particularly in light 
of the standard error of .02. 
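
The two ways of weighting the stratum proportions can be reproduced in a few lines of Python. The function and variable names are assumptions chosen for the illustration; the counts and weights are those of the union example.

```python
def stratum_proportions(yes, n_h):
    """Proportion of 'yes' answers within each class."""
    return [y / m for y, m in zip(yes, n_h)]


def weighted_mean(values, weights):
    """Sum of values times weights (weights are assumed to sum to 1)."""
    return sum(v * w for v, w in zip(values, weights))


n_h = [260, 140]          # union and nonunion cases in the sample
yes = [26, 54]            # 'yes' answers in each class
W = [0.60, 0.40]          # known population proportions

p_h = stratum_proportions(yes, n_h)
sample_weights = [m / sum(n_h) for m in n_h]

p_unweighted = weighted_mean(p_h, sample_weights)   # same as 80/400
p_weighted = weighted_mean(p_h, W)                  # uses population weights

print(round(p_unweighted, 3), round(p_weighted, 3))  # 0.2 0.214
```

The only difference between the two estimates is the set of weights: the sample's own class shares in the first case, the known population shares in the second.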

It is more productive to look at the matter in probability terms, 
as we did in the case of the proportionate sample in Section 9. The 
weighted mean p_w in this case has a variance about 2 percent less 
than the variance of the unweighted mean p'. That is, by weighting 
the sample of 400 we make it as accurate as a sample of 408 would be 
without weighting. The gains in precision obtained by this kind of 
weighted estimate are just about the same as they would be by propor- 
tionate stratified selection procedure. This should not be surprising in 
view of the fact that the method of estimation in the two processes is 
the same, except for the differences in the number of actual cases 
obtained in the various strata. The gains are small, as in the propor- 
tionate sample of elements, and for the same reasons. 

*It must be noted that this last fact is in sharp contrast with the kind of 
adjustments by which conformity to population proportions is forced on data 
obtained in some haphazard fashion. The latter may have proportions grossly 
different from the population; hence the “corrections” may be very drastic. It 
may be that the “corrected” mean is very much closer to the population mean 
than the uncorrected mean. However, because the data do not arise from a prob- 
ability sample, statistical theory will not bridge the gulf from sample mean to 
population mean. The conjecture must be made by expert judgment. It is even 
possible that the adjustment took the “corrected” mean farther from the desired 
population mean. Furthermore, no probability can be attached to the occurrence 
of this undesirable event. 

The above is but a simple illustration of a procedure for using 
auxiliary information in the estimation process in order to reduce the 
variance of the sample estimate. It is a poor illustration in that the 
gain in precision is very small. There are other estimation procedures 
applicable to other situations by which some large gains in precision 
may be made. The two most important are called ratio estimates and 
regression estimates. In some instances, the gains made through the use 
of better estimates are spectacular (1, pp. 111-188; 3, p. 87; 22, p. 155). 
PART III 


Methods of 
Data Collection 


In terms of a gross division, there are only three methods 
of obtaining data in social research: one can ask people questions; 
one can observe the behavior of persons, groups or organizations, and 
their products or outcomes; or one can utilize existing records or 
data already gathered for purposes other than one’s own research. 
The last three chapters in this part describe techniques to be used 
in connection with each of these broad classes of methods. The first 
chapter is a theoretical analysis of the problems of measurement. 

In social psychology, emphasis is placed upon methods of data 
collection in which the researcher himself introduces the specifica- 
tions and the controls, and the chapters on interviewing and be- 
havioral observation address themselves to this problem. The useful- 
ness of existing records for many research purposes, however, is 
demonstrated in Chapter 7. 
CHAPTER SIX 


Problems of Objective 
Observation 


Helen Peak¹ 


All scientists have as an ideal the objectification of their 
methods and techniques. That is to say, they aspire to observe, 
record, and interpret events in such a fashion that other 
observers can verify their findings. The term “objective meas- 
ures” has been used rather narrowly by psychologists to apply to 
certain limited kinds of “tests” which have standardized 
procedures for administration, for scoring, and for interpretation 
of the scores on the basis of their reliability and validity and with 
reference to established norms of performance. It is here main- 
tained that whether one speaks of standardized tools in this restricted 
sense of the term or of objective measures with the broader meaning 
of scientific observation, the problems are the same: they 
revolve around the conditions and procedures for making observa- 
tions and interpreting them in the framework of a certain fundamental 
[...]. This chapter will attempt to spell out the 

¹ I am grateful to Eugene [...], with whose collaboration this chapter 
was originally planned, for a careful reading of the material and for the many 
useful suggestions. Many other colleagues were helpful at various stages of 
the revision. I wish to mention particularly the editors of this volume and 
C. H. Coombs, D. W. Chapman, E. L. Kelly, and Earl Carlson. 


It is difficult to draw the line between what is and what is not 
methodology relevant to research in social psychology and related 
areas. Projective techniques, for example, ordinarily considered in 
the realm of "clinical psychology" have been used to good advantage 
on social-psychological problems. Even techniques which belong 
in the realm of physiological psychology can be used in research on 
attitudes and group behavior. It was obviously neither desirable nor 
possible for this book to attempt to cover so wide a range of measure- 
ment techniques. The following chapters attempt to describe not 
all the methods which might conceivably be employed in social 
research but rather those methods of measurement which the person 
engaged in research in this area cannot avoid. 
pose, they must be constructed to specifications. In either case, 
whether it be a matter of selecting an appropriate measurement 
device or of building one to order, certain criteria are operating 
and certain questions are asked. To make clear what these questions 
are and to summarize and criticize some of the answers to them is 
our present task. 

It will be convenient to organize the discussion around six basic 
problems: 

1. What behavior is to be selected and recorded in order to obtain 
the information required? 

2. Under what conditions are observations to be made? How is the 
observational situation structured? 

3. What evidence is there that some process with functional unity 
is being observed? 

4. Has an attempt been made to summarize what is observed in 
quantitative terms? Can a score be assigned, and what are the 
metrical characteristics of that score? 

5. What is the nature and meaning of the process which has thus 
been observed or inferred? How is it to be labeled? What is its 
validity? 

6. And, finally, how stable are the observations? Can the same 
results be obtained under what appear to be the same conditions? 
Are the measures reliable? 

The first two of these questions will be considered together in 
the following section, “The Observational Situation and the Selec- 
tion of Significant Behavior.” Questions 3 and 4 will be discussed 
together in the section on “Evidence of Functional Unity,” because 
the demonstrations of functional unity are closely related to prob- 
lems of scoring. Questions 5 and 6 will be treated in separate 
sections. 


THE OBSERVATIONAL SITUATION AND THE 
SELECTION OF SIGNIFICANT BEHAVIOR 

Something is to be observed and measured, and the investigator 
must start with hypotheses about the way in which this process or 
questions that must be answered regarding instruments and proce- 
dures which purport to be objective, and will undertake to discuss 
some of the techniques that have been worked out to answer these 
questions. Space limitations have forced us to be selective in the 
choice of techniques which are evaluated, and this selectivity has 
entailed the omission of many approaches which might well have 
been included. Furthermore, since many of the specific tools for 
the measurement and observation of social-psychological variables 
are given detailed attention in other chapters of this volume, those 
used here for illustration in connection with answers to our basic 
questions have been chosen because they represent examples of 
important instruments not discussed elsewhere in the book. Al- 
though this criterion has resulted in an emphasis on individual 
rather than group processes, the questions discussed are relevant 
to all observational situations. The problems are general and basic 
whether we are dealing with group process and structure, with 
interaction, or with attitudes, motives, traits, and similar variables. 
Furthermore, they arise with respect to all sorts of procedures for 
observing these variables: interview techniques, participant observa- 
tion, questionnaires, rating scales, tests, and many others. 


The problems to be discussed in this chapter will recur in 
many parts of a volume devoted to the analysis of scientific methodol- 
ogy in social psychology. In so far as the other chapters are con- 
cerned with observational methods applicable to specific problems, 
they do not provide a general statement of the problems of 
observation and measurement. This is the task of this chapter and of 
Chapter 11, in which Clyde H. Coombs addresses himself to basic 
problems of measurement and observation. 


THE PROBLEMS OF OBSERVATION 


In setting out to observe an event, some choice of instruments, 
techniques, and procedures must be made. If instruments and 
techniques are already at hand, questions will be asked about their 
characteristics and utility. If none are available which fit the pur- 
tions to be given, time limits to be used, and the questions to be 
asked. Perhaps the failure to cope more adequately with other 
aspects of this problem stems from the implicit assumption that the 
processes being measured are independent of most variables and 
therefore relatively static and stable, an assumption which is clearly 
false for such processes as attitudes, needs, adjustment mechanisms, 
interaction, and other group phenomena. Here it is obviously just 
as necessary to have some notion about the existence of uncontrolled 
determinants as it is to know whether a patient has walked up a 
flight of stairs before taking a metabolism test. 

Ideally, then, objective observation requires a clear indication 
of the behavior to be selected from the complex matrix of activity 
and definite specifications of the conditions for eliciting this be- 
havior in such a way that inferences may be made about the 
variables involved. But it should be noted that there is no simple 
methodological prescription for meeting these requirements. They 
are the products of the theoretical sophistication, knowledge, orig- 
inality, and hard work that are demanded in order to build theories 
about interrelationships between events and to test those theories 
empirically. To identify the important variables, to think of ques- 
tions to ask, ways of structuring the situation, the appropriate 
behavior to observe: these steps depend to an important extent on 
the flash of insight and the hunch founded on knowledge and ex- 
perience with the problems under consideration. The rules for 
the formulation of questions and statements, which are found in 
methodology texts, are simply convenient criteria against which to 
test ideas but have nothing to do with producing them. And it will 
be apparent that here, as at many other points, the theoretical model 
which the investigator brings to the task will play a crucial role, for 
it will be a major source of the ideas which occur to him and of the 
choices which he makes. If, for example, he sets out to devise a 
measure of hostility with a knowledge of the psychoanalytic theory 
of defense mechanisms, the questions asked and the behavior ob- 
served will be very different from that which would seem relevant 
if manifest expressions of hostility were regarded as the only appro- 
priate data. 
event manifests itself in behavior, as well as the conditions under 
which such relevant behavior will appear. There may be occasions 
when interest lies wholly in seeing what people do (whether they 
cheat in a given situation, what answers they give to a question, what 
a group decision will be) with no desire to make inferences about 
traits or attitudes or processes. More often, however, inferences 
will be made from behavior to variables such as tensions, motives, 
attitudes, traits, perceptions, or group norms which are only reflected 
in behavior. When this is so, the investigator will have started with 
some theory (implicit or explicit) about the way in which these 
events affect behavior, and he will have chosen to observe those 
reactions regarded as appropriate for his purpose. In either case, 
it is the scientist’s problem, if he is to be objective, to specify as 
clearly as possible what behavior or interaction patterns are to be 
noted. 


Similarly, a statement of the standard observation situation 
must be made, including as complete a description as possible of 
the total structure of the situation for the reacting individual or 
group: what the observer does, the questions he asks or the stimulus 
items he presents, the instructions he gives, the comments he makes, 
and the attitudes he reveals in the interaction situation. The inven- 
tory of conditions should also cover relevant facts about the reacting 
individual or social unit, particularly those psychological character- 
istics and processes which might affect the events under observation, 
and this will include that infinitely complex series of social variables 
arising out of the interrelations of people with one another. 
It must be specified which of these are to be held constant and which 
are to vary in the standard observational situation. 

If we assume that every behavior or reaction is a 
function of many antecedent conditions, the ideal of objective 
observation requires the identification and observation of as many 
of the antecedents of the event as are necessary to yield 
stable results. Standardization has typically 
been [...]. There has been, for example, only tardy recognition 
of the fact that measures of intelligence have little meaning unless 
they are obtained under known conditions of education and motiva- 
tion. As a rule, attention has been given to checking or controlling 
only aspects of the immediate situation, such as the instruc- 
mean that (1) they change concomitantly, (2) they are dynamically 
interdependent, or (3) one process changes dependently with an- 
other (cause and effect). Most of the methods for discovering the 
presence of functional unity deal simply with concomitant change, 
i.e., they derive in large part from correlational techniques and 
consequently are mute with respect to the existence of dynamic 
interdependence or causal relations. These distinctions between 
kinds of functional unity will be discussed further in the section 
on “Dynamic Organization.” 

The first task is, then, to describe and evaluate certain of the 
correlational procedures for demonstrating what is here called func- 
tional unity. This is not to inquire about what is being measured 
but rather about the existence of some process or some aspect of an 
event with sufficient integrity that it may be identified as organized. 


The Problem of Scoring 

As we examine the concept of functional unity, it will become 
clear that it is intimately related to the question of scoring. Some 
method of assembling the evidence from multiple items of behavior 
into a composite is needed in order to reduce the raw data to man- 
ageable form, and scoring is one form of summarizing. It is possible, 
of course, in the simplest case, that the data will be a correct or 
incorrect answer to a single task, an agreement or disagreement with 
a single statement about an isolated issue, or simply the fact that 
a child either got into a fight or did not under some condition. 
Typically, however, many behavior items will have been noted, 
and these must somehow be drawn together. This may be done 
intuitively and the result expressed verbally without benefit of 
numbers, but it is clear that where procedures for combining data 
into scores are definitely prescribed, errors of subjective interpreta- 
tion and bias are eliminated in some degree. Furthermore, there are 
many obvious advantages in numerical indices, always provided they 
do not distort the data. These problems will be discussed below. 

In any scoring procedure several steps are involved. The 
first of these requires that the items of behavior to be 
combined into a score must be classifiable into the same category 
on some grounds. This may involve simply a decision 
that all the items in a test do in fact fall in the same category 
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THE EVIDENCE OF FUNCTIONAL UNITY 


The Meaning of Tuncttonal Unity 

To ask whether a test or series of observations has succeeded 
in isolating a characteristic or event with some sort of unity is to 
raise a question that observers of human activity and experience 
have been asking for a long time and in many circumstances. Over 
the years, human nature and human behavior in its most inclusive 
sense have been sliced many ways, but the resulting segments have 
often proved to be artifactual and without systematic significance. 
Wherever discriminations can be made on the basis of qualities of 
an event or experience, or on the basis of its intensity and amount, 
or in terms of differences in relationships, or on any basis whatever, 
a line may be drawn and a "thing" identified. So it is not surprising 
to find some 18,000 trait names in the well-known list which Allport 
and Odbert assembled from English dictionaries (S). The ability to 
make discriminations is always involved in the isolation of parts or 
segments of reality, but the reduction of this chaos of distinctions to 
scientifically useful concepts demands operational methods for 
discovering order and organization and for testing the meaning of the 
organizations discovered. We shall review and evaluate some of the 
procedures devised to determine the possible lines of fracture and 
foci of organization in the complicated matrix of behavioral events. 

In the simplest case, processes or events or objects may be 
regarded as having unity by virtue of sharing some common 
characteristic. Thus, a group of round objects has unity with respect 
to roundness, or you may classify responses in terms of a quantifiable 
characteristic, such as speed, or as leading to some consequence, 
such as injury or support of another. If events share these 
characteristics, they are, in some sense, homogeneous. The number of 
possible dimensions that may be described in terms of such essentially 
static similarities is legion. Such categories have their uses; indeed, 
they are indispensable. 

But for the systematic purposes of science it is usually more 
fruitful to look for functional unities that indicate more than 
superficial similarities among events. To say that processes or behavior 
events are functionally organized has one of three meanings: it may 
use with these scores? How do we best express the relative scores 
of persons when comparing them with others? 

The answers to these questions open up the important problem 
of the relationship between the model implied in the use of 
mathematical operations such as addition, multiplication, correlation, 
etc., and the data to which the operations are applied. Coombs, in 
Chapter 11, has described the characteristics of various types of 
scales, and the reader is referred to that discussion in this 
connection. The important point is this: If we impose on a set of data 
assumptions, such as the common one that equal steps between 
numbers assigned to reactions represent equal intervals or increments 
of some psychological process, the results that emerge from 
the mathematical operations may be determined more by this 
assumption than by the nature of the reality that is being measured. 
If we assume unjustifiably that our scale positions are equal 
psychologically and proceed to manipulate the numbers in accordance 
with these assumptions of continuity and equality, the resulting 
means and correlations will not reflect an accurate picture of the 
processes inferred to exist. 
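
The point about imposed assumptions can be illustrated with a small sketch. The data, the recoding table, and the function name `pearson_r` below are invented for illustration and are not drawn from any study cited here. The sketch correlates the same five ordered responses with a criterion twice, once treating the scale positions as equally spaced and once after stretching the top step. The rank order of the responses is identical in both cases; only the assumed spacing differs.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical criterion values and scale positions for five persons.
criterion = [1, 2, 3, 4, 5]
positions = [1, 2, 3, 4, 5]

# Same ordering, but the highest step is assumed to be much larger.
recoded = {1: 1, 2: 2, 3: 3, 4: 4, 5: 10}
stretched = [recoded[p] for p in positions]

r_equal = pearson_r(positions, criterion)      # equal-interval assumption
r_stretched = pearson_r(stretched, criterion)  # unequal-interval assumption

print(round(r_equal, 3), round(r_stretched, 3))
```

The two coefficients differ although nothing about the persons' responses has changed, which is the sense in which the result may reflect the assumption rather than the process measured.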

Specific Procedures for Determining Functional Unity 

Certain operational procedures will be discussed in [...] 
from the point of view of the evidence of functional unity 
they provide. These procedures differ in a number of respects, 
including the level of behavior complexity to which they [...]. 
Those to be discussed first look for unity and [...] 
relatively simple behavior items, such as responses to [...] 
or statements. Here we consider the [...] internal consistency 
among items by item analysis and by the [...] 
some rational order among behavior items by the scaling [...]. 
At the next level, the units or segments of behavior [...] 
scores; correlations between tests are still [...] 
organization. At a more complex level, factor [...] 
relationships among [...] 
special problems of establishing dynamic [...] 
these are typically complex organizations. 

For each of the methods of analyzing functional unity, we shall 
that, for example, they all deal with Russia, or with extroversion, or 
with achievement. Some statistical or experimental method of 
determining functional unity may have been employed to give empirical 
evidence of a common factor or dimension or characteristic. In 
any case, if there are many separate events which are to be reduced 
to more simple form, an inescapable step will be categorization. 
This simple "mapping of objects into symbols" results in what 
Coombs and others have called the nominal scale, though it 
constitutes a step in building any sort of scale. 

At this point there will usually be some resort to numbers 
in order to express the observational results as succinctly as possible. 
This typically involves the process of counting the behavior items 
which fall into a specific category: the number of items passed or 
agreed with, the number of instances of sympathy, or fighting, or 
defensiveness, the number of judgments that A is better than B. 
There are other possible quantitative measures, such as speed, 
duration, extent, and intensity of categorized events, but these have 
not been used as widely in social psychology and, although 
important, do not present the same order of problem. 
[...] questions about the rationale of counting. It must 
be specified, for example, how each behavior item is to be weighted 
in the counting process which yields the total score. Do we assign 
one point to each statement agreed with, or shall we weight some 
more heavily than others? Should different weights be given 
[...] assented to with different intensities? The problems 
of weighting will be considered in connection with the procedures 
[...] functional unity (see pp. 252-253, 256, 262, 273, 275). 
[...] at this point, that the major problem lies in finding 
a theoretically [...] meaningful rationale for determining that items 
shall be [...] different weights in the composite score, 
and [...] be found to discover [...] 
their effects combine to produce [...]. 

[...] questions about the metrical significance 
[...] have calculated a score, [...] 
[...] between scores 
represent equal [...] psychological dimension? Is there 
an absolute zero? What sort of mathematical operations can we 
are used, weights ranging from one through five are assigned, so 
that extreme agreement with one item gets the same weight as 
extreme agreement with another item. The score is the sum of the 
weights. 
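
The summing rule just described can be written out in a few lines. This is only a sketch under stated assumptions: the function name `likert_score`, the bounds check, and the example responses are ours, and the sketch assumes that every item has already been keyed so that a higher weight means more agreement in the same direction.

```python
from typing import Sequence


def likert_score(weights: Sequence[int], low: int = 1, high: int = 5) -> int:
    """Return the total score as the plain sum of item weights.

    Each weight is the degree of agreement checked for one item,
    coded from `low` to `high` (one through five in the text).
    """
    for w in weights:
        if not low <= w <= high:
            raise ValueError(f"weight {w} lies outside {low}..{high}")
    return sum(weights)


# Hypothetical respondent: five items, weights one through five.
respondent = [5, 4, 2, 5, 1]
total = likert_score(respondent)
print(total)
```

Every item contributes on the same footing, so the weights carry no theory about how the underlying processes combine; that is the arbitrariness the text points to.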

Many other methods of assigning weights have been worked out 
(25, Chap. 18) (28, Chap. 7, Supp. Study B 4 and D). For example, 
weights may be determined on the basis of the correlation of each 
item with some criterion to which the test is designed to predict. 
This often means that the weights must be changed when predicting 
to different criteria. Or weights may be selected so that the 
dispersion of the individual scores is a maximum. But the difficulty 
with these and many other methods is that they do not commonly attempt 
to offer any theoretical reason concerning the manner in which the 
processes reflected in the items actually work together to produce 
the predicted consequences. 

These correlation techniques have serious inadequacies [...] 
of functional unity among the items of a test, whether [...] 
applied to the analysis of relationships between items or between 
items and total score. In the first place, some of [...] 
into problems regarding statistical [...]. Tetrachoric and 
biserial r assume linearity of the relation between variables, 
normal distribution, and continuous variables. [...] impossible 
in some cases to determine whether the data [...] statistical 
assumptions, for example, the assumption [...] 
dichotomously answered item [...] requirements may be 
clearly inappropriate to the [...] when tetrachoric 
r is calculated from tables having a single [...]. 

In the second place, the interpretation of [...] indices as 
descriptors of the degree of functional unity [...] 
instrument is often ambiguous. There is [...] no satisfactory 
way to disentangle the combined effects [...] the presence 
of items at various levels of difficulty [...] the degree of 
heterogeneity of processes involved in responding [...]. 
Some indices, such as phi and point biserial [...] artificial limits 
set on their size by the difficulty of the [...]. The maximum 
value of phi, for example, is unity only when the same proportion 
of people pass both items involved in the correlation. But it is 

[...] Loevinger (32, pp. [...]) for a [...] 
raise the following questions: (1) What, briefly, is the method? How 
are scores arrived at and how is functional unity determined? (2) 
What assumptions and limitations are implicit in the use of the 
procedure? (3) In what sense does the method discover functional 
unity? 


ANALYSIS OF INTERNAL CONSISTENCY BY ITEM ANALYSIS. Although 
item analysis has been used most often as a basis for selecting 
test items in the interest of improved prediction to a criterion, 
it may also be thought of as a device for establishing the existence 
of functional unity within a test. In the simplest case, the procedure 
involves finding correlations between test items, using such statistical 
tools as tetrachoric r and the phi coefficient, and selecting for the 
final form of the test those items with highest average correlations. 
Closely related is the procedure of finding the relationship between 
each test item and total test score. This may involve correlating 
[...] the regression of total 
[...] (prediction of total score from item 
response) or the regression of item on the total score (prediction 
of item response from total score) (1). 

[...] correlation methods in which at least one 
variable [...] may be employed for determining [...] 
tetrachoric r, phi coefficient, biserial r, [...] 
assumptions involved in the use of these [...]. 
But before a total test score can be [...] responses a decision 
must be made [...] weighted and combined to 
produce the [...] assigned arbitrarily and without 
any rational [...] agreed with or passed may be 
given the same [...] with. This has typically been [...] 
procedure for attitude tests. The Likert method [...] 
asked to indicate the degree of agreement or disagreement with 
each item. If five [...] agreement and disagreement 

[...] for a detailed discussion of the techniques of item 
analysis [...]. 

5 See Loevinger for a [...] purposes 
of prediction to a criterion, which involves choosing items with low 
intercorrelations and high correlations [...] compared 
with selection looking [...] consistency of items 
(equivalence principle). 
factory and ambiguous evidence regarding the existence of 
functional unity among the items of a test. Each item may be correlated 
to some extent with every other item and with the total score, but 
this still does not mean that all the items involve the same process, 
even to some limited extent. The existence of correlation between 
pairs of items is a necessary but not a sufficient condition of the 
existence of unity between all the items that make up a test. Finally, 
nothing is revealed about the nature and complexity of [...] 
measured by the instrument made up of items thus selected. [...] 
is only evidence that some of the items are related, but because 
of statistical ambiguities the extent of this relationship is [...]. 
At best, the methods are appropriate to the initial stages of [...] 
of this problem of functional unity or to the selection of [...] 
a test to be used chiefly for prediction to a specific [...]. 

ANALYSIS OF INTERNAL CONSISTENCY BY [...] 
ITEMS. Here, as in the preceding section, we shall be concerned 
with item analysis. But the scaling methods go beyond [...] 
items and make the attempt to order the items with respect to one 
another, i.e., to assign scale positions to them. [...] 
methods proposed by Thurstone, Guttman, and [...] will be 
examined in order to determine the contribution of these 
techniques to the solution of the problem of functional [...]. 

In the construction of Thurstone's equal-appearing interval 
scales the selection of items and the process [...] 
simultaneously. Many statements about a given [...] are 
prepared, and judges are instructed to sort the statements into eleven 
piles considered equal distances apart and ranging [...] 
favorable statements at one end to least favorable at the other, 
[...] neutral position in the middle. On the basis [...] 
possible to determine the mean or [...] 
each statement by the various judges [...] interquartile range 
(Q) of the assigned positions. Statements with [...] values and with 
median scale positions approximately [...] distances apart are 

[...] have been chosen as types [...] 
the important points to be considered. The [...] important procedure 
which [...] been omitted is [...] ordered and partially ordered 
metric based on [...] comparisons, which is discussed in [...]. 
Edwards and Kilpatrick have suggested an ingenious combination 
of the Thurstone, Likert, [...] Guttman procedures (16). 
necessary that a discriminating test include items over a range of 
difficulty. In this case the index gives little information about the 
degree of functional unity between items or between an item and 
a total score (24, p. 342). 
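
The ceiling that item difficulty places on phi can be checked numerically. The sketch below uses the usual fourfold-table formula for phi and the standard expression for its largest attainable value given two pass proportions. The function names and the example table are our own illustration, not material from the sources cited.

```python
import math


def phi(a: int, b: int, c: int, d: int) -> float:
    """Phi coefficient for a fourfold table.

    a = pass both items, b = pass item 1 only,
    c = pass item 2 only, d = fail both items.
    """
    row_pass, row_fail = a + b, c + d
    col_pass, col_fail = a + c, b + d
    denom = math.sqrt(row_pass * row_fail * col_pass * col_fail)
    return (a * d - b * c) / denom


def phi_max(p1: float, p2: float) -> float:
    """Largest phi attainable when proportions p1 and p2 pass the two items."""
    lo, hi = sorted((p1, p2))
    return math.sqrt((lo * (1 - hi)) / (hi * (1 - lo)))


# Hypothetical case: 20 of 100 pass a hard item, 80 of 100 pass an easy one,
# and everyone who passes the hard item also passes the easy one.
observed = phi(a=20, b=0, c=60, d=20)
ceiling = phi_max(0.20, 0.80)
print(round(observed, 3), round(ceiling, 3), round(phi_max(0.5, 0.5), 3))
```

In this invented table the two items are as closely related as their difficulties allow, yet phi is far below unity; only when the pass proportions are equal can it reach 1.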

The size of the correlation coefficient between an item and 
total score will be affected by still another factor, the length of the 
test. Obviously the relation of a single item and the total score 
will be closer for the short test than for the long test and will tend 
to decrease as test length increases. Gulliksen suggests that this 
tendency will be reversed when the point is reached at which the 
added test length increases the reliability of the total score so much 
that an increase in the item-test correlation will result. The 
relationship over such a range has not been investigated (25, p. 393). 

In interpreting the meaning of these correlations, we must also 
remember that no one of them can be regarded as the equivalent 
of a Pearson r, unless the assumptions underlying that coefficient 
have been met. This being true, the apparent advantages of phi 
and point biserial vanish. Although they can be used with variables 
that are not continuous and not linearly related, the meaning of 
an index of a given size in these circumstances is highly ambiguous. 
As already indicated, the item-total score relationship may also 
be expressed as the regression of total score on an item or of an 
item on the total score. An illustration of this procedure is found 
[...] sometimes called the index of discriminatory power 
( ). The total scores of those who answer a specific item in different 
[...] whether those who respond to the item 
favorably have a total score falling at the favorable end of the 
distribution [...] vice versa. The procedure may, of course, be 
reversed [...] average item scores of those at the extremes of the 
distribution [...] scores. In either case, when the prediction 
[...] relationship has been shown to exist between the 
item [...] in so far as this can be demonstrated by plotting 
two [...] distribution. One limitation of this procedure 
[...] related items is that only part of the data is used. 
[...] apparent relation between item and total 
score will [...] when the middle parts of the distribution of 
scores are included in the analysis. 

It must be concluded that these methods usually yield unsatis- 
as in the attitude tests, or excellence of handwriting, or amount 
of threat implied in certain actions, or cohesiveness of a group, or 
anything else. The evidence that the task is a feasible one and that 
some such dimension does exist rests largely on the fact that 
substantial agreement among judges can be reached regarding the 
relative scale position of items. Moreover, a number of studies have 
seemed to show that statements scaled by the [...] 
maintain their scale positions even when persons with differing 
attitudes serve as judges (19, 27, 36). But as Carter ( ) has 
indicated, this question cannot be answered once and for all; [...] 
necessary to take the problem of sampling into account when selecting 
judges to be used in the standardization process. 

Furthermore, scale positions have been shown to be affected by 
other conditions. Farnsworth, for example, [...] 
position of items on the Peterson Attitude [...] 
1930 to 1940 (17). More recently, Hovland and Sherif ( ) [...] 
pointed out the inconsistency between the two [...] (1) 
that assigned scale positions and judges' attitudes are independent 
and (2) that perception and judgment are [...] 
and attitude factors. They have reopened [...] 
an investigation of the influence of judges' [...] 
and distributions of items sorted by the [...] method. The 
judges used were Negroes, pro-Negro whites, and anti-Negro white 
groups. There is evidence of marked piling up [...] 
unfavorable categories by Negroes and pro-Negro [...] in 
the pro-Negro categories by [...], however, 
that even in this experiment there is no [...] distortion of relative 
item positions. The implied [...] meaningfulness and 
perceived distances between items [...] 
own attitudes, but roughly the same items [...] more 
favorable or less favorable than others [...] groups. The rho 
between item rank assigned by Negroes and anti-Negro white 
subjects is 0.937. 

The question inevitably arises [...] whether individuals work 
within the same frame of reference [...] and 
whether respondent reactions give [...] same ordering 
of items as do the reactions of judges. Thurstone himself was 
interested in this problem and in the late twenties devised what he 
called "the index of similarity or relevance," [...] on logic which is 
selected for the scale. Persons who take the test check those items 
with which they agree, and the test score is the scale value of the 
median item checked. 
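
The scoring rule can be sketched briefly. The scale values and item labels below are invented, and the function name `thurstone_score` is ours. When an even number of items is checked, the sketch takes the midpoint of the two middle scale values, which is an assumption of this illustration rather than a rule stated in the text.

```python
from statistics import median


def thurstone_score(scale_values: dict, checked: list) -> float:
    """Score = median scale value of the items a person checks as agreed with."""
    values = [scale_values[item] for item in checked]
    if not values:
        raise ValueError("no items were checked")
    return median(values)


# Hypothetical scale values on an eleven-point continuum.
scale_values = {"a": 1.5, "b": 3.2, "c": 5.8, "d": 8.1, "e": 10.4}

adjacent = thurstone_score(scale_values, ["b", "c", "d"])  # neighbouring items
scattered = thurstone_score(scale_values, ["a", "e"])      # items at both ends
print(adjacent, scattered)
```

The two invented respondents receive nearly the same score although one checks neighbouring statements and the other checks only the two extremes, so the median by itself says nothing about how the endorsements are patterned.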

Thurstone considered it appropriate to use means and standard 
deviations with scores obtained on these tests because he regarded 
them as interval scales (41, 43). This implies, of course, that 
numerically equal changes in scale position at any point in the scale 
may be regarded as indicating equal amounts of change in attitude. 
It must be remembered, however, that the status of this assumption 
remains doubtful for a number of reasons. Even though judges 
consider the eleven piles as being equal distances apart, this does 
not mean that the processes inferred from the statements actually 
change linearly with these perceived equal distances. This places 
[...] responsibility for interpreting not [...] 
amount of unfavorableness implied by agreement with certain [...] 
increments of unfavorableness. In other words, [...] 
do, without any knowledge of the problems 
involved, what the experts have found impossible. 

[...] should be a convincing 
theoretical [...] any assumption about the 
relation of an [...] the phenotypic response, and such 
theory is wholly [...]. 

[...] distortion in 
the scale [...] the extreme ends of 
[...] of items at these points. 
A score of eleven [...] 
with the one most extreme [...] 
items are checked, the median will be [...] 
themselves due to what has [...] psychophysics as the 
"end effect." This is the tendency [...] judgments to pile up in the 
end categories (23) [...] 
sorting methods [...]. Assumptions 
about equal intervals should be regarded with skepticism. 

The [...] characteristic of this scale is, then, that 
[...] the meaning and the ordering of items and [...] 
value assigned to the inferred variable. This [...] some common 
dimension to [...] judged and that this dimension can be 
abstracted. The dimension may be favorableness toward an object 
ciently relevant to belong in the scale must be a matter of judgment. 
Moreover, there must be as many graphs as there are items, and no 
method is provided for representing the homogeneity of the test 
as a whole. 
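
The index of similarity discussed in this section, the ratio of the number endorsing both A and B to the number endorsing B, is easily computed. The sketch below is ours: the endorsement data are invented, and the handling of an item that no one endorses is an assumption, since the ratio is undefined in that case.

```python
def similarity_index(endorsements: list, a: str, b: str) -> float:
    """Index of item `a` with respect to item `b`: n_ab / n_b.

    `endorsements` holds one set of endorsed item labels per respondent.
    """
    n_b = sum(1 for chosen in endorsements if b in chosen)
    if n_b == 0:
        raise ValueError(f"no respondent endorses item {b!r}")
    n_ab = sum(1 for chosen in endorsements if a in chosen and b in chosen)
    return n_ab / n_b


# Hypothetical endorsements from five respondents.
respondents = [{"A", "B"}, {"A", "B"}, {"B"}, {"A"}, {"C"}]

a_given_b = similarity_index(respondents, "A", "B")  # 2 of the 3 who endorse B
a_given_c = similarity_index(respondents, "A", "C")  # none of those endorsing C
print(round(a_given_b, 3), a_given_c)
```

One such value is needed for every other item before the graph for item A can be drawn, and a separate graph is needed for each item; the calculation is mechanical, while the reading of the graphs is not.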

It appears that Thurstone has not made extensive use of this 
technique in the construction of his scales. Edwards reports a 
communication from Thurstone in reply to an inquiry on this point 
which states that he has come to feel that items should be selected 
on the basis of factor analysis rather than by the criterion of 
similarity. 

In any case, it is clear from a number of studies that the 
Thurstone scales, when tested by the index of similarity, often contain 
statements which do not maintain the scale [...] 
judges. Edwards (15) has shown, for example, that the "neutral" items 
in a number of Thurstone scales are likely [...] 
persons who agree with statements widely scattered [...]. 
[...]cha produces similar evidence in a study which [...] analysis 
of the scatter of endorsed statements on the [...] 
Toward War Scale (14). [...] 
highest and lowest scores often showed endorsements of statements 
ranging from strongly favorable to [...], and he questions 
the value of medians derived from [...] widely scattered scale 
[...]. 

The fact is, then, that the functional unit abstracted out by the 
Thurstone method of equal-appearing intervals is not usually the 
same as that which is found by [...] on respondent 
reactions, and the question [...] whether the organization 
found by one or the other of [...] methods is the more truly 
fundamental. The answer must be given [...] efficiency 
of the constructs which [...] specific purposes. 
If, for some purposes, constructs [...] are ordered 
in terms of respondent answers [...] verifiable predictions 
to behavior or to other constructs, [...] can be regarded as 
superior for these purposes. The [...] that judged scale 
position of statements may prove to [...] useful for quite different 
purposes, many of which have not [...] explored. Such 
consistently ordered series of items might, [...] example, be regarded 
as measures of group norms [...] of certain 
statements rather than as a basis for inferring an attitude. 
essentially the same as that adopted more recently by Guttman, 
Loevinger, and others. He sought evidence of how respondent 
answers were patterned and whether persons agreeing with a 
statement in a given scale position accepted only statements in adjacent 
positions or whether they checked items scattered over a wide range 
of scale values with gaps between. If the items are perfectly ordered 
by the respondent, there should be no reversals in judgment (no 
items disagreed with) between the first and the last items agreed 
with. This is the basic logic of scaling when there is a single response 
to each item by each person. The index of similarity was designed 
to give some notion of how each item fell into such a pattern (41, 43). 

After statements of known scale position and low Q values have 
been selected for a scale by Thurstone's method, the test is 
administered to a group of subjects who indicate the items with which 
they agree and disagree. From these data the index of item similarity 
is calculated. The index for any item A with respect to another item 
B is based on the total number who endorse item B (n_b) and the total 
number who endorse both A and B (n_ab). The index is the ratio 
n_ab/n_b. If everyone who endorses B also endorses A, then the index is 
unity. If no one endorses both A and B, the index is zero. After these 
indices have been determined for all items in relation to A (the item 
[...] plotted against the scale values of the various 
items. [...] similarity of the item A with respect to other items is 
[...] appearance of the whole diagram. If the indices 
[...] high and if 
[...] we move away from this item A, 
then [...] because it has been shown that 
people [...] statement are not likely to agree with 
statements which [...]. Although the 
calculation [...] every other item is 
clear cut and unambiguous, the decision whether a given item is suffi- 

[...] cumulative type, it scales perfectly if answering 
correctly or agreeing with [...] difficult item means that all easier 
or more acceptable items have been [...] correctly or agreed with. 
In the differential type of test a person [...] will disagree with all 
the less favorable items, agree with [...] moderate degrees of 
favorableness, and disagree with all [...] favorable items. There 
will be no gaps. The terms cumulative and differential [...] (33). 
Thurstone uses the terms increasing probability [...] maximum 
probability type test (41). Coombs [...] somewhat more [...] 
refer to this same distinction (see Chap. 11). 
reproduced from knowledge of the total test score. In the 
cumulative type of test, this will be the case when it can be shown that 
any person who agrees with a given item will agree with all less 
extreme items and will disagree with all more extreme items. The 
same score cannot be made by persons who accept different 
statements. For the differential type of scale, respondents will accept 
a group of items which are adjacent in scale position, disagreeing 
with those statements on either side of this position. 

The details of Guttman's method are described in Volume IV 
of the American Soldier Series (40). Stated briefly, the procedure 
involves mapping respondent answers to each item onto a scalogram 
in which items are placed horizontally across the top of the table, 
ranging from those most favorable to those least favorable. 
Respondents are ranked on the vertical margin of the table, with 
individuals giving the largest number of favorable responses at the 
top. After these preliminary steps, items and respondents are shifted 
in position until the maximum order possible with the given set 
of responses is discovered. The ideal sought is a pattern in which 
any person A with a higher total score than any other person B will 
have agreed with items as high as or higher than B. If there is 
no scale, items agreed with will scatter all over the scalogram 
without relation to the rank order of the total scores, and no 
rearrangement of items or persons can reveal this kind of order. 

The extent to which a given series of items departs from the 
ideal of unidimensionality is expressed by the coefficient of 
reproducibility. This index is simply the proportion of item responses 
that can be correctly predicted from knowledge of the scale scores 
of persons taking the test. When the scale consists of 5 items which 
have been given to 100 people, the total number of responses will 
be 500. If 50 errors are made, the coefficient of reproducibility will 
be 0.90, which is set as the minimum value permissible for a scale 
(40, p. 77). 
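
The arithmetic of the coefficient can be sketched as follows. The first function reproduces the worked figures in the text. The second counts errors by comparing each response pattern with the ideal cumulative pattern implied by the person's total score; this is only one of several counting conventions in use and is an assumption of this sketch, not necessarily the procedure of the source. All names and the small response matrix are invented.

```python
def coefficient_of_reproducibility(n_errors: int, n_items: int, n_people: int) -> float:
    """Proportion of item responses correctly reproduced from scale scores."""
    return 1 - n_errors / (n_items * n_people)


def count_errors(matrix: list) -> int:
    """Count deviations from the ideal cumulative pattern.

    Rows are respondents; columns are items already ordered from the most
    to the least frequently endorsed (1 = agree, 0 = disagree). A person
    with total score s is predicted to agree with the first s items only.
    """
    errors = 0
    for row in matrix:
        score = sum(row)
        predicted = [1] * score + [0] * (len(row) - score)
        errors += sum(1 for seen, expect in zip(row, predicted) if seen != expect)
    return errors


# The worked example from the text: 5 items, 100 people, 50 errors.
rep_text = coefficient_of_reproducibility(50, 5, 100)

# A tiny hypothetical scalogram: the second respondent breaks the pattern.
responses = [[1, 1, 0], [1, 0, 1], [0, 0, 0]]
errors = count_errors(responses)
rep_small = coefficient_of_reproducibility(errors, n_items=3, n_people=3)
print(round(rep_text, 2), errors, round(rep_small, 3))
```

A set of responses in which every row matches its ideal pattern gives a coefficient of 1.0; each departure lowers it by one part in the total number of responses.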

Although this is the principal criterion for the existence of 
unidimensionality, Guttman suggests that at least four other factors 
should be taken into account in making a judgment on this matter. 
Two of these should be mentioned briefly at this time. The items 
of the test should cover as wide a range of marginal distributions 
as possible, and items showing a 50-50 distribution (i.e., 50 percent 

See footnote 6, p. 258, on cumulative and differential type tests. 
That is to say, consensus about the meaning of behavior may thus 
be determined. Such information has potential usefulness in the 
interaction situation where, rightly or wrongly, we make such 
inferences and interpretations. Similarly, such scaling devices are 
appropriate for determining the generally anticipated social sanctions 
following certain actions, information that is potentially of great 
importance to social psychology. 

A final point must be raised. Heterogeneous measures such as 
Thurstone's or Likert's (31), which do not scale in the Guttman 
sense of the word, will result in a situation in which equal scores 
are obtained by persons who accept widely different items. In this 
case the pattern of item responses cannot be predicted from 
knowledge of the total score. This is usually interpreted as meaning 
that the scores do not represent a measure of the same process. This 
may be true but, on the other hand, it must be remembered that 
the same behavior may be determined by different patterns of 
antecedent conditions. In this case there may emerge from reactions 
to an apparently heterogeneous collection of items of a scale an 
index of some process which reflects a tendency to positive or 
negative action, and two persons agreeing with different statements 
may in fact be comparable in the strength and direction of this 
tendency to act in a given direction. We do not know as yet that 
independent measures of the same unidimensional processes in 
different people furnish the best means to the prediction of behavior. 


The method of analysis by Guttman's unidimensional scales has 
[...] its central problem the discovery of unidimensionality in 
[...] items drawn from a universe whose content is 
[...] described arbitrarily by the investigator or by a group 
[...] may be marital adjustment, opinion about 
[...] esthetic qualities of paintings, or any 
other [...] that the ordering of persons [...] 
that sample of items will be essentially the same as 
[...] for the universe of items (40, p. 81). The evidence 
for the [...] found in respondent reactions to a 
sample [...] opinions of judges about the 
[...] items, and the essential meaning of unidimensionality 
[...] fact that respondent answers to test items 
order themselves in such a way that all the item responses can be 
reached on the intensity dimension is regarded as the zero point 
which divides the favorable from the unfavorable side of the scale. 

Two comments should be made about this function. First, it 
is not altogether clear that it can establish a division between 
favorable and unfavorable attitude responses in all cases, for not 
all data show this U-shaped function and those which do typically 
have a great deal of scatter about each of the points— i.e., many 
people falling at the middle or neutral point of the scale will feel 
strongly. Second, even when there is a U-shaped curve, the zero 
point cannot be regarded as an absolute zero where there is a 
complete lack of attitude or of affect toward the object of the scale. 
The determination of such a zero point, like the discovery of equal 
intervals, must involve some theory about the way in which the 
variable being measured is reflected in the observed behavior. 
There is no such adequate theoretical rationale as yet. 

It is assumed by Guttman that "the invariant cutting point 
between favorable and unfavorable responses" can be determined 
by this method and that this point will be independent of the bias 
resulting from the specific questions asked. In other words, it is 
asserted that the percentage of persons favorable toward the issue 
or object will remain the same from one set of questions to another 
when the point of shift from favorable to unfavorable is determined 
by selecting the score corresponding to the lowest point on the inten- 
sity scale. Guttman provides some evidence that this is true, but it 
remains to be seen whether this location of zero has any validity 
in the sense of being related to a point of shift from favorable to 
unfavorable action. 

The Guttman method of scaling can be evaluated from a num- 
ber of points of view. First, questions have been raised regarding 
the ambiguity of certain steps in the procedure. We have indicated 
that Guttman recommends the application of several criteria in 
addition to the coefficient of reproducibility in judging the scala- 
bility of items. It has been pointed out that the application of these 
criteria is essentially subjective and that there are no clearly pre- 
scribed rules for arriving at a final Judgment. For example, it must 
be difficult in practice to determine merely by inspection whether 
errors are random or systematic; yet this judgment is one step in 
deciding whether the data reveal the presence of a scale, a quasi- 
scale, or a nonscale type of test. Likewise, the problem of the 




of respondents agree and 50 percent disagree) should be included. 
This is recommended in order to avoid the spuriously high coeffi- 
cients which result from the fact that the reproducibility of any 
one item can never be less than the percentage of respondents falling 
into the modal category (i.e., the category in which the most people 
are found). If all test items show a high proportion agreeing or 
disagreeing, the coefficient will inevitably be high regardless of the 
internal organization of the items. This amounts to saying that 
the size of the coefficient is dependent to some extent on the 
difficulty of the items of the test. 
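The dependence of the coefficient on item difficulty can be illustrated numerically. The following Python sketch is an illustration only and not Guttman's own procedure: it assumes dichotomous (0/1) items, predicts each item from total score at whichever cutting point gives the fewest errors, and averages the item reproducibilities. The function names and the two small data sets are hypothetical.

```python
def item_reproducibility(responses, item):
    """Best share of correct predictions of one 0/1 item from total score."""
    n_items = len(responses[0])
    scores = [sum(row) for row in responses]
    fewest_errors = len(responses)
    # Try every cutting point: predict agreement when score >= cut.
    # cut = 0 predicts "agree" for everyone; cut = n_items + 1 predicts "disagree".
    for cut in range(n_items + 2):
        errors = sum(
            1
            for row, score in zip(responses, scores)
            if row[item] != (1 if score >= cut else 0)
        )
        fewest_errors = min(fewest_errors, errors)
    return 1 - fewest_errors / len(responses)


def modal_share(responses, item):
    """Share of respondents falling in the item's modal category."""
    p_agree = sum(row[item] for row in responses) / len(responses)
    return max(p_agree, 1 - p_agree)


def coefficient_of_reproducibility(responses):
    """Mean item reproducibility (assumed simplification for illustration)."""
    n_items = len(responses[0])
    total = sum(item_reproducibility(responses, j) for j in range(n_items))
    return total / n_items


# Hypothetical data: every item is accepted by 9 of 10 respondents,
# and the three dissenters form no cumulative pattern.
lopsided = [[0, 1, 1], [1, 0, 1], [1, 1, 0]] + [[1, 1, 1]] * 7
# Hypothetical perfectly cumulative patterns.
perfect_scale = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
```

Because predicting the modal category for everyone is always one of the available cutting points, no item's reproducibility in this sketch can fall below its modal share, whatever the internal organization of the items.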

Guttman also says that attention should be paid to the pattern 
of the errors (i.e., to those cases where item responses are incorrectly 
predicted from the total scores) in order to distinguish random 
errors, which produce quasi-scales, from those errors which indicate 
that there is no order among the items (nonscale types). This judg- 
ment becomes the basis of selecting items which scale and of 
rejecting those judged to belong on other dimensions. 

One of the advantages of this method is that problems of 
assigning weights to each item of a scalable test are minimal. This 
follows from the simple fact that the relative scores obtained by 
individuals would be the same regardless of the absolute size of the 
weights used, provided the weights correspond to the rank order of 
the items on the scale. When rank orders are used as weights, the 
score may be expressed equally well in a number of ways: as the 
median rank of the items accepted, as the weight of the most favor- 
able or least favorable item (in the case of the differential test), or 
as a sum of the weights. The relative scores would be the same in 
all these cases. 
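A small numerical check makes the point concrete. The sketch below is illustrative only: the response patterns, the two sets of weights, and the function names are hypothetical, and the patterns are assumed to form a perfect cumulative scale.

```python
def weighted_sum(pattern, weights):
    """Score as the sum of the weights of the items accepted."""
    return sum(w * x for w, x in zip(weights, pattern))


def highest_weight(pattern, weights):
    """Score as the weight of the most extreme item accepted (0 if none)."""
    return max((w for w, x in zip(weights, pattern) if x), default=0)


def rank_order(scores):
    """Rank of each respondent (0 = lowest score); ties share a rank."""
    ordered = sorted(set(scores))
    return [ordered.index(s) for s in scores]


# Perfect scale patterns over four items, ordered from most to least accepted.
patterns = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 1],
]

rank_weights = [1, 2, 3, 4]        # weights equal to the rank order of the items
steep_weights = [1, 10, 100, 1000]  # same rank order, very different sizes

by_rank_sum = rank_order([weighted_sum(p, rank_weights) for p in patterns])
by_steep_sum = rank_order([weighted_sum(p, steep_weights) for p in patterns])
by_top_item = rank_order([highest_weight(p, rank_weights) for p in patterns])
```

All three scoring rules place the five respondents in the same order, which is all an ordinal scale claims to establish.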

The Guttman scale is an ordinal scale, and no claims are made 
for it as an equal-interval scale. Guttman does propose to determine 
a "zero point" which separates the unfavorable end of the scale 
from the favorable end by means of what he calls the intensity 
function. After the respondent has answered a given item on the 
test, he is asked: "How strongly do you feel about this opinion: not 
at all strongly, not very strongly, fairly strongly, very strongly?" 
Scale scores are then plotted against the intensity scores thus ob- 
tained, and in a good many cases a U-shaped curve results. That is, 
those who feel strongly tend to be those who fall at the extremes of 
the scale. The scale position corresponding to the lowest point 
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indicates tliat scale analysis affords a rigorous test of the existence 
of a single meaning for an attitude area in the eyes of the respojid- 
ents. This is an acceptable interpretation, but Guttman goes on to 
make the further assumption that wlien a scale is sho^vn to exist 
among items, the universe of vhicli the items are a sample will also 
scale, as will any other set of Items from that universe. On logical 
grounds this statement must also be accepted, hut this is not the 
same as saying that we can in practice identify that universe froni 
which scalable items come. Gntiman says that a universe must be 
defined in this fashion: 

An attribute or item belongs to the universe by virtue of its 
content. The investigator indicates the content of interest by the 
title he chooses for the universe and all attributes with that con- 
tent belong in the universe. . . . The evaluation of the content 
thus far remains a matter that may be decided by consensus of 
judges or by some other means (40, p. 84). 

He points out, further, that samples from many universes so defined 
do not scale or are made up of several subscales. Items which scale 
under one set of conditions and with one population do not neces- 
sarily prove scalable in other circumstances. It appears, therefore, 
that the most that can be learned from the demonstration of uni- 
dimensionality is that a given set of items scales under the specified 
conditions and with the specified sample. Having determined this, 
we have no way of inferring directly what the universe is of which 
these items are a sample. Only empirical test of some theory about 
the reasons and conditions for the appearance of order among the 
items will make it possible to identify the limits (frames 
of reference, populations) to which one may generalize the findings. 

The question may be raised whether the method is adequate 
for discovering all kinds of fundamental psychological variables. It 
may be appropriate mainly to the task of finding unidimensionality 
where answers to certain questions imply definite answers to other 
questions which are obviously related. For example, if a person 
is asked whether he wants to stay in the Army for a career after 
the war and answers, "Yes," it is reasonable to expect that he would 
also give an affirmative answer to the question "Are there any con- 
ditions you can think of under which you might consider staying 
in the Army after the war?" In such questions the dimension 

[...] relation of the size of the coefficient [...] difficulty of the items [...] 
highly ambiguous [...] number of items in the [...] 
difficulties should be employed [...] specified precisely how [...] 

Another source of ambiguity is found in the procedure involved [...] 
practice to try every combination of rows and columns, a decision 
that the best possible arrangement has been found must be in so[...] 

The fact that the size of the coefficient of reproducibility will 
be a function of the specific items selected at the [...] 
another important problem that has not been solved [...] 
the coefficient will be changed not only by the presence or absence 
of unidimensionality in the area but also by the fineness [...] 
of the steps between items. Other things equal, when these [steps] 
are large, there will be fewer errors of prediction of [item responses] 
from total score and consequently a higher coefficient of repro- 
ducibility than when the steps are fine. As already indicated, 
Edwards and Kilpatrick have suggested the use of the Thurstone 
technique as a basis for the initial choice of statements which will 
have objectively determined distances between them (16). The 
Hovland-Sherif study reported above throws doubt on the useful- 
ness of this procedure for such a purpose, since the distribution of 
scale items is shown to be changed radically as a function of judges' 
attitudes toward the issue (30, 37). If, however, the judging group 
is chosen as representative of the sample on which the scale is to 
be tested and used, the Thurstone method might at least provide 
an objectively defined distribution of items which could be de- 
scribed as a condition of the obtained coefficient of reproducibility. 
It is clear that the problem is not unique to the Guttman method, 
but it does point up another necessary reservation in interpreting 
the allegedly objective index, the coefficient of reproducibility. 

In interpreting the meaning of unidimensionality, Guttman 

test ($H_t$) is a weighted average of the $H_{ij}$'s for each pair of items, 
adjusted so that the coefficient will equal 0 on the perfectly hetero- 
geneous test and equal 1 for the perfectly homogeneous test. 
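The pairwise index that enters this average can be computed directly from pass/fail records. The sketch below assumes the form $H_{ij} = (p_{j/i} - p_j)/(1 - p_j)$, where item i is the more difficult item, $p_j$ is the proportion passing the easier item, and $p_{j/i}$ is the proportion passing the easier item among those who pass the harder one; this form is taken from the description of dividing by $1 - p_j$ elsewhere in this chapter. The function name and the data are hypothetical, and the weighting used to obtain $H_t$ is not reproduced here.

```python
def pairwise_homogeneity(harder, easier):
    """H_ij for two 0/1 items, with `harder` the less frequently passed item.

    Assumes at least one person passes the harder item and that
    not everyone passes the easier item.
    """
    n = len(easier)
    p_j = sum(easier) / n
    # Responses to the easier item among those who passed the harder item.
    easier_among_passers = [e for h, e in zip(harder, easier) if h == 1]
    p_j_given_i = sum(easier_among_passers) / len(easier_among_passers)
    return (p_j_given_i - p_j) / (1 - p_j)


# Hypothetical perfectly cumulative pair: everyone who passes the harder
# item also passes the easier one.
cumulative = pairwise_homogeneity([1, 1, 0, 0, 0, 0], [1, 1, 1, 1, 0, 0])

# Hypothetical unrelated pair: passing the harder item tells nothing
# about the easier one.
unrelated = pairwise_homogeneity([1, 1, 0, 0], [1, 0, 1, 0])
```

Under these assumptions the cumulative pair yields the maximum value and the unrelated pair yields zero, whatever the difficulty levels of the two items.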

The logic of $H_{it}$ assumes that item answers should rank two 
people in the same way that total score ranks them, i.e., the person 
who passes an item should have a higher score than the person who 
fails it. $H_{it}$ is equal to the percentage of such correct discriminations 
minus the percentage of wrong discriminations. The relation be- 
tween $H_t$ and $H_{it}$ is not known. 

In selecting the items for a test, it is proposed that all the $H_{it}$ 
values be calculated and that the items with low $H_{it}$ values be 
eliminated. This follows the usual procedure of item analysis. Loev- 
inger then points out that if the heterogeneity ($H_{it}$) is evenly dis- 
tributed over all the items, "we still cannot decide whether we 
have what Guttman calls a quasi-scalable universe or two sub- 
universes. To make such a decision we need a table showing the 
coefficients of $H_{ij}$. Apparently for a quasi-scale these coefficients 
will be moderate in magnitude and fairly uniform. In case [among] 
the values of $H_{ij}$ [there] are some very high and some very low, 
despite uniform values of $H_{it}$, we expect there is a way of dividing 
the items into two or more tests each of which is more homogeneous 
than the original test" (35, p. 520). Loevinger goes on to indicate 
that it is doubtful that items will often be easily separated into 
two or more homogeneous tests in this way, with $H_{ij}$'s all close to 
unity or zero. In other words, although the calculation of these 
coefficients is straightforward and more objective than some of the 
procedures which Guttman recommends for interpreting the coeffi- 
cient of reproducibility, Loevinger is faced with the difficulty 
of determining what shall be considered acceptable evidence of the 
existence of homogeneity or of a single dimension. She recognizes 
this fact and suggests that it must be determined in the light of the 
purposes at hand. A test which is sufficiently homogeneous for some 
purposes will not be good enough for others. 

Loevinger has attempted to rule out the effects of difficulty 
on her coefficient of homogeneity by the device of dividing the value 
by $1 - p_j$. This amounts to expressing $p_{j/i} - p_j$ as a pro- 
portion of the maximum value it can attain at the given level of 
difficulty of the item. As a result, $H_{ij}$ can vary from one to zero at 
any level of difficulty. This constitutes some gain over the am- 

reflected may be simply the respondent's need to be logical. Although 
not all the unidimensional scales reported by Guttman are made 
up of such manifestly related questions, this does tend to be the 
case. How much beyond the discovery of obvious logical patterns 
this method of scaling can go in practice is yet to be determined. 

Another serious limitation of the method must be mentioned. 
Coombs has pointed out that if a test is made up of items which 
scale in Guttman fashion, the opportunity to discover deviant re- 
sponses will have been eliminated (10). This is another way of 
saying that perfect scales in the Guttman sense will be found only 
for psychological processes which are common to a population. Areas 
yielding unique patterns tend to be eliminated in the scaling 
procedure. 


The logic behind Loevinger's technique of homogeneous tests 
is essentially that which Guttman employs; it assumes that on a 
perfectly homogeneous cumulative type of test a person who passes 
the more difficult item or agrees with the less popular item (accepted 
by fewer people) should also pass the easier task or agree with the 
[...]. The method makes no assumptions about [...] 
appropriate for qualitative data. Its use is re[...] 
The essential difference in [...]niques is found, then, not in the 
logic of scaling but in the indices [...] indicating the amount of 
homogeneity* [...] unidimensionality [...] 

[...] indices are proposed for describing homogeneity: homogeneity 
[...] item-test homogeneity [...]. Where $p_i$ is the probability of 
passing the ith item and $p_{j/i}$ is the probability of passing the jth 
item among those passing the ith item (i is a more difficult item 
than j), then 

$$H_{ij} = \frac{p_{j/i} - p_j}{1 - p_j}$$

If there [...] is equal to 1. When there 
is complete [heter]ogeneity, it equals 0. Homogeneity over the whole 

* [...] "homogeneity" to "scaling," and uses it in much the same 
sense as Guttman uses "unidimensionality" (35). 

Whether such organization has any relation to the functional unity 
implied by scaling of respondent answers is a problem for empirical 
test. The available evidence does not suggest close correspondence 
between scales constructed by these different methods. 

Scaling methods provide a more systematic and rational method 
of studying the organization among items than do traditional item 
analyses. Another advantage is found in the fact that when items 
are shown to scale, a rational method of weighting the items is 
provided. As long as the weights assigned are equal or increase 
or decrease consistently with the scale position of the item, persons 
obtaining a given score will do so by answering the same items, 
and a change in weights will not change the relative positions of 
scores. 

The following important limitations of scaling methods must 
be kept in mind: 

(1) The confusion between the effects of difficulty levels and 
of heterogeneity of process on indices of unidimensionality or 
homogeneity has not been wholly eliminated. The ambiguity which 
this introduces into the interpretation of the coefficient of repro- 
ducibility has been mentioned. Loevinger's attempt to correct for 
this effect yields an index varying from one to zero, but the proper- 
ties of the index between these values are not clear. 

(2) It must be noted that in these, as in all other methods of 
investigating interrelations between items or tests, the effects of 
variable error cannot be separated from the effects of heterogeneity 
of processes actually involved in the reaction. This will be discussed 
in the section on reliability. 

(3) The functional unity reflected in a set of scaled items is 
very evidently dependent on such conditions as the sample used and 
the structure of the testing situation. A scale which exists with one 
sample and under certain conditions may disappear with a new 
sample and altered conditions, a fact which recalls the statement that 
the standard meaning of any instrument must be defined with 
respect to a set of clearly specified determinants. 

(4) Scaling alone tells nothing about the character or complexity 
of the processes which are responsible for the observed organization. 
The observed functional unity may be due to a complex of factors 
varying together under the specified conditions of the test or it may 

biguities of the coefficient of reproducibility with respect to diffi- 
culty of items. 

These coefficients, like Guttman's, will vary with populations, 
with situations, and with sets of items. Moreover, the same questions 
may be raised about the limits within which the scaling method 
is appropriate. To what extent will such scales be found for psy- 
chological processes that do not involve agreement with logically 
related statements or behavior? And is there any way to apply the 
same logic to unique individual processes not organized along the 
same dimensions in different persons? 

Loevinger rejects Guttman’s notion that scaling procedures 
discover anything about the scalability of a universe of which the 
test items are supposed to be a sample and insists that the investi- 
gator's judgment of what constitutes such a universe is not to be 
trusted; that scale analysis should rather be regarded as a method for 
the objective definition of psychological characteristics. 

Finally, in a discussion of the relation of her techniques to 
factor analysis, Loevinger makes the point that a test showing high 
homogeneity cannot be expected to be factorially pure; that the 
[...] satisfied equally well [...] items which measure a single factor 
or by [...] which measure an approximately constantly 
weighted sum of factors. In other words, the dimensions singled out 
[...] irreducible, unanalyzable processes, but 
[...] processes which act consistently in different 
people under certain conditions. 

[...] discussed find evidence of functional [...] 
respondent answers are the data and the tests are [...] 
Loevinger), orderliness is seen in the fact that [...] 
regarded as [...] position which must be touched on the way 
to D just as [...]estone must be passed en route to the sec- 
ond. When judges agree on the position of statements with respect 
to some dimension [...] (Thurstone scales), this 
[...] perception of a dimension on the part of judges. 

The "split-half" and "alternate forms" methods of determining 
"reliability" become special instances of measuring functional unity 
by intertest correlations. A deliberate attempt is made, of course, 
to devise “comparable” stimuli or items, which presumably tap the 
“same” process, but the evidence that this has been accomplished 
rests on the correlations obtained. The same correlations become 
evidence of reliability. There is no satisfactory way of separating the 
effects of the differences in test items in two test forms from uncon- 
trolled conditions which introduce random variation into the 
responses (unreliability). The isolation of the latter is the purpose 
of reliability tests, which are discussed in a later section of this 
chapter. 

A number of familiar problems again emerge as limitations or 
as sources of ambiguity in interpreting these intertest correlations 
as evidence for the presence or absence of functional unity. Here 
again the effects of difficulty level (or frequency-of-agreement level) 
of the test items may affect the correlations and obscure, or enhance 
misleadingly, the true relationship stemming from functional simi- 
larities in the psychological process underlying test responses. For 
example, if one test is made up of items with which a large propor- 
tion of the sample population agree and the other test has less gen- 
erally accepted items, the correlation between the two tests will 
be reduced by this fact. The coefficient cannot equal unity even 
though the same processes are common to the two tests (6). But under 
other conditions misleadingly high correlations may result from a 
confusion of the effects of heterogeneous process and different dif- 
ficulty levels. If a sample of children six to twelve years of age were 
given a reasoning test and a test of motor skills, each made up of 
items covering a range of difficulty, the positive correlation that 
would result would not imply any integration or interrelation of 
the processes of reasoning and motor skills but would be a function 
almost exclusively of the fact that tasks of greater difficulty fall 
within the competence of older children whether the tasks involve 
reasoning or motor response. This calls attention to the importance 
of knowledge about and intelligent selection of samples when inter- 
preting the meaning of intertest correlations as indices of functional 
unity. It also suggests the desirability of equating instruments for 
difficulty before testing their intercorrelations. 
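The ceiling that unequal difficulty places on a correlation can be shown at the level of two dichotomous items. The sketch below is an illustration under assumptions: it uses the fourfold-point (phi) coefficient rather than the statistics of the studies cited, and the function names and data are hypothetical.

```python
import math


def phi(a, b):
    """Fourfold-point correlation between two 0/1 response vectors."""
    n = len(a)
    p_a = sum(a) / n
    p_b = sum(b) / n
    p_both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1) / n
    return (p_both - p_a * p_b) / math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))


def phi_ceiling(p_a, p_b):
    """Largest phi attainable given only the two proportions agreeing."""
    low, high = min(p_a, p_b), max(p_a, p_b)
    return math.sqrt(low * (1 - high) / (high * (1 - low)))


# Hypothetical items: one accepted by 8 of 10 respondents, one by 2 of 10.
# Everyone who accepts the unpopular item also accepts the popular one,
# so the two items are as closely related as their difficulties allow.
popular = [1] * 8 + [0] * 2
unpopular = [1] * 2 + [0] * 8

observed = phi(popular, unpopular)
ceiling = phi_ceiling(0.8, 0.2)
```

Here the observed coefficient reaches its ceiling, yet that ceiling is far below unity; two items of equal difficulty related in the same way would have a ceiling of 1.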

The sample populations to which the tests are given may vary 

[...] such methods will be successful in discovering the kind of process 
organization which is reflected phenotypically [...] 
bear an obvious logical relation to each other. Moreover, the 
functional unity found by these methods can be [...] 
when the same kind of organization is common to the whole popu- 
lation tested. Perfect scales are not obtained when there are unique 
patterns among those in the sample. 

(6) Finally, it must be asked whether the analysis of complex 
psychological processes into their unidimensional components (Gutt- 
man) is necessarily the best and only method of observing and 
measuring them. It is probable that an inferred process, such as 
favorableness or unfavorableness toward some object, may be re- 
flected in different ways in different people and that a comparable 
amount of favorableness or unfavorableness may have its source 
in different processes in these various persons. If this is true, the 
heterogeneous test (Likert, Thurstone) might best serve the purpose 
of discovering the strength and direction of the attitude complex as 
a determinant of behavior. 

INTERTEST CORRELATIONS. The literature contains many instances 
of the familiar technique of seeking for communality of process by 
correlating scores made in different situations or on different tests. 
To take a recent example, Adorno, Frenkel-Brunswik, Levinson, 
and Sanford (2) report relatively high degrees of relationship be- 
tween tests of various kinds of ethnocentrism (E scale), and between 
ethnocentrism and a test of "fascist" characteristics (F scale). May and 
Hartshorne employed the same procedure in their futile search 
for a general trait of honesty (26).* 


* When we speak of intercorrelations as a method of demonstrating func- 
tional unity, it is obvious that the many different correlational procedures rest 
on essentially the same logic. The special problems of methods such as Pearson's r, 
chi square, analysis of variance, etc., are discussed in standard statistical works. 
It should also be noted that communality of process may be sought within 
the individual as well as between tests. See discussion in this chapter, p. 276, and 
Cattell for other patterns of co-variation (8, 9). 

may not accurately represent the relationship between the variables. 
But, as will be shown later, the ultimate test of the validity of any 
construct and of the measures which enter into its definition is found 
in the utility of the construct in the process of reducing the matrix 
of events to some meaningful order. If constructs derived from cor- 
relations prove to have value as parts of a dynamic system, this 
suggests that the mathematical model is appropriate in some degree. 

It has been indicated that a major problem in the construction 
of any instrument must be faced when combining responses to a 
number of items to produce a total score. The problem must be 
recalled briefly at this point because the particular solution selected 
in any given case will affect intertest correlations and thus the evi- 
dence of communality in the tests. The earlier comments on pro- 
posed solutions to this problem are relevant in the present context. 
Only one additional point need be elaborated here. It was suggested 
that most methods of assigning weights lack a theoretical rationale 
for determining the manner in which items of observed behavior 
should be combined to produce a score. The point is this: the satis- 
factory determination of weights to be given items in a test and the 
manner in which they are combined to produce a score should rest 
on knowledge of how the processes tapped by the items combine 
dynamically to produce effects. Coombs (13) has summarized con- 
cisely the way in which distinguishable processes may enter into 
combinations. (1) Processes may be additive, so that a greater quan- 
tity of any one will produce increased effects. (2) They may be con- 
junctive; that is, each must be present in some minimum amount 
before any effect is produced and no amount of one will compensate 
for the absence of the other. For example, both knowledge and 
motivation are necessary in solving a problem. (3) They may be 
disjunctive. One does not occur if the other occurs. This is best 
illustrated by the substitute mechanisms, viz., hostility may be 
expressed by reaction formation, by direct aggression, or by dis- 
placed aggression. This means that some items may properly be 
added, but that others which are not additive must be combined in 
such a way that they represent the result which actually emerges. 
Until analyses of this kind are possible, the arbitrary scores obtained 
by combining test items have limited uses and the meaning of the 
correlations of such tests is ambiguous. 
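The three modes of combination can be contrasted in a few lines. The sketch below is an assumed formalization for illustration and not Coombs's own notation: addition stands for the additive case, the minimum for the conjunctive case (no amount of one process compensates for the absence of the other), and the maximum for the disjunctive case (the result reflects whichever substitute process is expressed). The function names and the sample values are hypothetical.

```python
def additive(a, b):
    """More of either process raises the combined effect."""
    return a + b


def conjunctive(a, b):
    """The weaker process sets the limit (assumed stand-in: minimum)."""
    return min(a, b)


def disjunctive(a, b):
    """Only the process actually expressed counts (assumed stand-in: maximum)."""
    return max(a, b)


# Hypothetical respondent: ample knowledge (3) but no motivation (0).
knowledge, motivation = 3, 0
scores = {
    "additive": additive(knowledge, motivation),
    "conjunctive": conjunctive(knowledge, motivation),
    "disjunctive": disjunctive(knowledge, motivation),
}
```

The same two item responses yield a score of 3 under the additive and disjunctive rules but 0 under the conjunctive rule, so a simple sum would misrepresent the outcome wherever the underlying processes are conjunctive.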

When all these problems have been taken into account, an 

in two ways: in the qualitative characteristics of persons sampled 
and in the distribution of scores on these [...] 
correlations may be affected by either difference. The California 
study (2) illustrates both these effects. Correlations between ethno- 
centrism and political-economic conservatism ranged from 0.14 for 
San Quentin prisoners to 0.86 for a group of working-class women. 
It is clear from the data available that correlations are higher for the 
populations having larger standard deviations, but there is also evi- 
dence that the size of the coefficients is related to the kinds of groups 
sampled. The authors suggest, for example, that the low correlation 
between ethnocentrism and political attitudes in the prisoners may 
be due to the inadequate frame of reference which these men have 
as a basis for evaluating political and economic events (2, p. 836). 
This means, then, that evidence of communality of process does not 
provide conclusions generalizable beyond the samples used unless 
there is empirical evidence or a theoretically grounded basis for 
knowing what the conditions of generalization to other samples are. 
Here, as elsewhere, the degree of relationship indicated will be 
a function of the specific statistical procedure selected for express- 
ing the relationship. This selection should be made in the light of 
the characteristics of the data and the uses to be made of them. 
If the data are essentially qualitative in nature, the relationship 
between the tests may be expressed simply in terms of the proportion 
of individuals correctly classified in certain categories of test B on 
the basis of knowledge of their classification in test A. 
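The effect of the spread of scores on the size of a correlation can be demonstrated with invented figures. The sketch below is illustrative only; the scores are hypothetical and bear no relation to the California data.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical scores: test B follows test A apart from a fixed disturbance
# of two points, alternately added and subtracted.
test_a = list(range(1, 11))
test_b = [a + (2 if i % 2 == 0 else -2) for i, a in enumerate(test_a)]

# The same relationship observed in a group with a narrow range on test A.
narrow = [(a, b) for a, b in zip(test_a, test_b) if 3 <= a <= 6]

r_wide = pearson_r(test_a, test_b)
r_narrow = pearson_r([a for a, _ in narrow], [b for _, b in narrow])
```

The rule linking the two tests is identical in both groups, yet the coefficient is much smaller in the group with the smaller standard deviation.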

The various types of correlation coefficients or regression equa- 
tions will express the relatedness, if the data are quantitative and if 
the statistical procedures are appropriate. It is possible, as a rule, 
to determine whether a distribution of test scores conforms to the 
assumptions of normal distribution, linearity, and homoscedasticity 
which are implicit in many of the correlation techniques. But it 
should be noted that even though these assumptions are met for 
distributions of scores expressed in arbitrary units, this does not con- 
stitute proof that the variables reflected in these scores have the same 
characteristics. This is to say that an unknown amount of distortion 
is introduced into measurements which employ a mathematical 
model to deal with variables that do not have the characteristics 
of the model. If the test scores misrepresent the variables involved 
in a correlation, the coefficients of correlation based on these scores 

tion, the following assumptions basic to methods such as those of 
Spearman, Thomson, and Thurstone. (1) When the factors con- 
tributing to a test score are uncorrelated, the correlation between 
the tests is equal to the sum of the products of the weights of those 
factors common to the tests. (2) The component processes combine 
by simple addition to produce the response made to the test. (3) 
Every person taking the test is assumed to possess every factor. (4) 
Finally, the weight of any factor on a test is the same for every 
person. 
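The first assumption can be written out for a small case. The sketch below assumes two uncorrelated factors and invented loadings for three tests; the names and figures are hypothetical.

```python
def reproduced_correlation(loadings_a, loadings_b):
    """Sum of the products of the two tests' loadings on each common factor."""
    return sum(a * b for a, b in zip(loadings_a, loadings_b))


# Hypothetical loadings of three tests on two uncorrelated factors.
loadings = {
    "test_1": [0.8, 0.3],
    "test_2": [0.6, 0.4],
    "test_3": [0.0, 0.5],
}

r_12 = reproduced_correlation(loadings["test_1"], loadings["test_2"])
r_13 = reproduced_correlation(loadings["test_1"], loadings["test_3"])
```

Tests 1 and 3 share only the second factor, so under this assumption their correlation comes entirely from the product of their loadings on that factor.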

Whether the factors resulting from an analysis will be correlated 
depends on an arbitrary decision by the investigator to use or not 
to use orthogonal dimensions. If he does so, the statement made in 
the first assumption will follow from the nature of the mathematical 
relationships. The fact that it must be decided whether to regard 
the factors as correlated illustrates an important aspect of any of 
the attempts to discover functional unity. Although the data them- 
selves set certain limits on what can be discovered, the hypotheses 
which the observer brings to the analysis will also determine what 
is found. Spearman and his group have argued that the factors 
should be uncorrelated, whereas Thurstone, among others, has 
insisted that factors which are the result of the complicated and 
interrelated processes of growth and experience should logically 
show some correlation. Since either type of factor may be found with 
the appropriate methods, the decision to seek one or the other must 
be made in terms of hypotheses which are independent of the process 
of factor analysis. 

Although the second assumption, that component processes 
combine additively, apparently yields the same results as would a 
multiplicative assumption (38), a question may be raised regarding 
the adequacy of factorial methods for discovering all the kinds of 
organization in which we are interested. It may be plausible to 
assume that abilities summate, but it is not clear that this is a 
justifiable basis for predicting the combined effects of all variables. 
It is equally reasonable to assume relationships of conjunctivity 
or disjunctivity. In any case, a plausible theory is needed before 
simple additive assumptions can be justified. 

It is difficult to obtain clear-cut evidence regarding the justifia- 
bility of the third and fourth assumptions: that every person taking 
a test possesses every factor, and that the test has the same loadings 

obtained correlation between two tests simply demonstrates that some 
functional unity exists, within the limits set by the size of the coeffi- 
cient and by the ambiguities of its interpretation. From this infor- 
mation alone it is impossible to say what the common aspect is and 
whether it is simple or complex. Moreover, no method is provided 
for isolating and measuring the homogeneous process, whatever it 
is, for test scores do not rank persons with respect to the common 
factor unless the correlation approaches unity. 

To summarize, correlation between tests has often been used 
as evidence of functional unity. The existence of correlation does 
not, however, illuminate the nature of this underlying process, its 
simplicity or complexity. Only when these indices approach unity 
(with allowance for unreliability) is it possible to rank persons in 
terms of the common process. 

All such evidence of functional unity must be interpreted in 
the light of the limitations set by the following facts: (1) Correlations 
are influenced by the difficulty levels of the tests correlated. (2) Cor- 
relations are a function of the sample tested. (3) The statistics 
expressing covariations must be appropriate to the test data if they 
are to reflect the true relationship between the variables being 
measured. (4) The weights assigned to test items often affect cor- 
relations and should be determined by knowledge of the manner 
in which the psychological processes tapped by the items combine 
to determine a composite result. 

FACTOR ANALYSIS. It was inevitable that sooner or later the 
attempt should be made to reduce the accumulating mass of knowl- 
edge about intercorrelations between tests to a smaller and more 
manageable number of dimensions. The process of factor analysis 
involves a demonstration by mathematical procedures that if there 
were certain underlying processes or factors, x, y, and z, a given table 
of test intercorrelations could be reproduced. It furnishes a state- 
ment of the number of factors needed to account for the matrix 
and the average loading of each factor on each test. 

The assumptions in factor analysis which concern us here are 
those that enter into the interpretation of the results of factorial 
studies as means of discovering functionally unitary processes. Since 
the data of factor analysis are tables of correlation coefficients, the 
assumptions underlying those statistics are involved at the outset, 
and this is a hurdle which is largely ignored. There are, in addi- 
be known if we are to be able to specify the conditions under which
the obtained factors exist.

Another problem is found in the fact that any matrix may be
factored into an infinite number of different components depending
on the method used. In other words, no one solution is unique.
The choice of any set of factors is, therefore, quite arbitrary and
will be made in terms of the theories and hypotheses adopted by the
investigator (8, pp. 281 ff.). The factorial methods, like others, yield
evidence of the existence of functional unity. Something is common
to reactions in different situations. The lines of cleavage are found
in different places by different methods of analysis, and the resultant
slices of psychological reality are not always the same. The decision
about which products are best, what they mean, and whether the
processes are simple or complex can be determined only in the light
of some integrated theory of personality and behavior and not by
statistical analyses alone.
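
The lack of a unique solution can be illustrated numerically. The sketch below assumes the simplest case, two orthogonal factors, and a hypothetical table of loadings for four tests; it is not any particular investigator's method. Rotating the reference axes through an arbitrary angle changes every loading, yet the correlations reproduced from the loadings remain the same.

```python
import math


def reproduced_matrix(loadings):
    """Sums of cross-products of loadings: off-diagonal entries are the
    correlations implied by the factors, diagonal entries the communalities."""
    n = len(loadings)
    return [[sum(a * b for a, b in zip(loadings[i], loadings[j]))
             for j in range(n)] for i in range(n)]


def rotate(loadings, degrees):
    """Rotate a two-factor loading table through the given angle."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [[a * c - b * s, a * s + b * c] for a, b in loadings]


# Hypothetical loadings of four tests on two orthogonal factors.
loadings = [[0.8, 0.1], [0.7, 0.2], [0.2, 0.6], [0.1, 0.7]]
rotated = rotate(loadings, 30)

original = reproduced_matrix(loadings)
after_rotation = reproduced_matrix(rotated)
max_diff = max(abs(x - y)
               for row_x, row_y in zip(original, after_rotation)
               for x, y in zip(row_x, row_y))
print(max_diff < 1e-12)
```

Since the data cannot distinguish between the two sets of loadings, the choice between them has to rest on considerations outside the analysis, as the text argues.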

In summary, the principal purpose of factor analysis is to reduce
a matrix of correlations to the smallest possible number of dimen-
sions in the interest of parsimonious description of the interrela-
tionships between the variables. The factors discovered may lead
to fruitful hypotheses to be tested by experimental methods. For
example, it might be demonstrated that a factor in certain tests
emerges under ego-involving instructions and disappears when the
instructions are changed. Such studies would greatly increase the
significance and scientific utility of the patterns of organization
discovered by this method.

There are a number of sources of ambiguity in interpreting
the results of factorial studies: (1) The appropriateness of certain
assumptions underlying factorial methods may be questioned. (2)

10 Limitations of space as well as the practical difficulties in using the
method in its present form have led us to omit a discussion of the special
features of Lazarsfeld's promising technique of latent structure analysis (40). Its
most important difference from the traditional methods of factor analysis lies
in the fact that it is adapted for use with nonparametric data and so avoids some
of the unwarranted assumptions about units of measurement implicit in the
earlier methods. It is applicable to data collected by the method of single
stimuli. Like other methods, it demonstrates only the compatibility of the data
with the existence of certain underlying trends. It does not isolate these for
measurement or determine what they are. Coombs and his students are also
engaged in devising methods appropriate to nonparametric data (5, 12).
on each factor for each person. When we are dealing with inter-
individual variation, it appears to be impossible to do more than
estimate an individual’s score on a given factor, and these estimates
have unknown probable errors (44). In discussing this problem,
Wolfle reports the observation that different people use quite diverse
methods of attack on the same problem. This suggests that attribut-
ing the same factor weights (averages of individual weights) to all
persons almost certainly produces distortion of the true situation in
the individual case (44, p. 4). The newer P technique, based on cor-
relations of data from one individual's performance on the same
tests administered on different occasions, may reveal the comparative
weights of given factors in different individuals, in so far as factors
may be correctly identified as the same in these different individuals
(9, pp. 270).

In addition to the problems raised by the assumptions under-
lying factor analysis, a number of other familiar questions must be
faced. There is, first of all, the recurrent problem of distinguishing
the effects of difficulty level and of process similarity or difference.
Since factor analysis starts with a matrix of correlation coefficients,
there is the possibility at the outset that the size of the coefficients
has been affected in the manner described above by the distribution
of item difficulty in the correlated tests. Furthermore, it is clear
that factorial methods may discover factors which are due largely
to difficulty level (or popularity level). Ferguson has shown, for
example, that when a matrix of intercorrelations between the items
of a homogeneous test (by Loevinger’s criterion) is factored, there
are as many factors as there are levels of difficulty in the test (18).
It would be expected that where tests in a matrix represent various
levels of difficulty, common factors would be discovered in those
tests at the same level of difficulty. This may constitute a means of
separating difficulty factors from others, but it may also produce mis-
leading conclusions.

Factorial methods, like others, have been shown to be affected
by the samples used. The factors and the factor loadings discovered
in one sample will often be different in a new population with dif-
ferent characteristics. This is not surprising, since "factors are
produced by anything that introduces correlation into a set of
variables" (44, p. 25). This means, then, that the relevant character-
istics of a sample providing the data for the original factoring must
of correlation merely indicates that two or more events change
together. It cannot be inferred from correlational evidence alone
that variables are interrelated dynamically in the way that every
part of the surface of a bubble depends on every other part for the
maintenance of the integrity of the whole, or that one event depends
on the others in the sense that raising the temperature of a gas
increases its volume. Projection, for example, is defined conceptually
as a relationship between a person's perception of hostility in others
and the presence of repressed hostility in the perceiver. Investigation
would presumably reveal correlation between these two variables.
This evidence, however, would fall short of providing a satisfactory
demonstration that repressed hostility in the perceiver and percep-
tion of hostility in others are dynamically or dependently related
and that there is in fact a process, projection, which can be defined
by this relationship. In the absence of control which would rule out
common determinants of these processes and make possible their
manipulation, their actual interdependence must be regarded as
hypothetical.

Hovland and his collaborators (29, pp. 72-73) have discussed this
problem, noting that the independent manipulation of two processes
is often difficult, if not impossible. This is true especially when the
variables involved are constructs that must be inferred from their
effects. These authors were concerned with the question of determin-
ing whether soldiers' motivation to go overseas was affected by their
attitudes toward the British. Specifically, the question is: if you show
movies depicting the British role in the war and find that both
motivation to go overseas and attitude toward the British change,
is there any way to find out whether change in attitude has been
responsible for change in motivation? Since correlation between
variables is the only evidence, it is not clear whether motivation has
been changed by the movies directly or by attitude change, nor
is it clear whether attitude affects motivation or motivation affects
attitude, or whether they interact. To establish such relationships,
control of the independent variable is necessary, and in this case
impossible. In the absence of such controls, we can only surmise
about the nature of the relationships. The problem is ubiquitous
and familiar but often overlooked.

Another familiar problem must be mentioned. How far can we
go toward establishing dynamic systems in the absence of an ade 
The solutions obtained are not unique: the factors that are dis-
covered are a function of the hypotheses of the investigator. (3) The
factors found may be due to anything which introduces correlation
between variables, and this may be a common level of difficulty
rather than a fundamental process of some kind. (4) Factors identi-
fied are a function of the sample used and of the conditions of the
observations. (5) Factor analysis, like the other methods, cannot
solve the problem of isolating error variance from other sources
of variation. (6) Considerations other than the procedures of factor
analysis must enter into the interpretation of the meaning of the
factors discovered. These will be discussed in the section on validity.

DYNAMIC ORGANIZATIONS. In the introductory remarks to the
section on functional unity, it was suggested that the phrase has
three possible meanings: concomitant variation, dynamic interde-
pendence, and dependence of one process on the other (cause-and-
effect relation). The techniques which have been described thus
far all involve the demonstration of concomitant variation. A ques-
tion should be raised at this point about the limitations of such
procedures for establishing all the types of functional organization
which are of interest to social psychology. The complex constructs
of every science are defined in terms of dynamic and dependent
relationships between identifiable segments or events. Force is a set
of relationships; the atom is likewise a system of relationships.
An attitude is a complex of cognitive and affective processes seen
as related to an object, or to an object image, and cohesiveness can
be described only in terms of interrelationships in a group. The
actual dynamics of the psychological system are rarely known but
are nonetheless implied.

Two problems will be noted at this time: (1) Although concomi-
tant variation may be expected to appear where events are inter-
dependent or where one is a determinant of the other, the fact of
correlation is not in itself sufficient to establish the dynamic or
dependent character of the relationships between variables. (2) A
distinction should be made between correlations based on observa-
tions of interindividual or intergroup variation as against intra-
individual or intragroup variation, for only in the latter case can
events be shown to change interdependently.

The distinction between concomitant variation, dynamic inter-
dependence, and causal relationships is a familiar one. The existence
in the same circuit, or the same individual, or the same group. The
two techniques should yield the same results if all the relevant fac- 
tors that might affect the relation between variables are the same 
from circuit to circuit, or person to person, or group to group, but 
this is unlikely to be the case, particularly in the study of organisms 
or societies. This problem can be handled only by the study of 
organization and change within the individual, where, for example, 
the interdependent alteration in various aspects of an attitude may 
be observed, despite the fact that the conditions of the change are 
very different from person to person. 
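
The difference between the two kinds of evidence can be made concrete with artificial figures. The sketch below uses hypothetical data for three persons observed on four occasions each, constructed so that persons with a high general level on one variable are also high on the other, while within any one person a rise in the first variable goes with a fall in the second. All names and values are assumptions chosen for illustration.

```python
import math


def pearson(xs, ys):
    """Product-moment correlation between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical general levels of three persons and occasion-to-occasion shifts.
person_levels = [1.0, 5.0, 9.0]
occasion_shifts = [-1.5, -0.5, 0.5, 1.5]

records = {}
for person, level in enumerate(person_levels):
    xs = [level + d for d in occasion_shifts]   # first variable
    ys = [level - d for d in occasion_shifts]   # second variable moves oppositely
    records[person] = (xs, ys)

# Intraindividual correlations: one coefficient per person, across occasions.
within_rs = [pearson(xs, ys) for xs, ys in records.values()]

# Interindividual correlation: across persons, using each person's average.
mean_x = [sum(xs) / len(xs) for xs, _ in records.values()]
mean_y = [sum(ys) / len(ys) for _, ys in records.values()]
between_r = pearson(mean_x, mean_y)

print([round(r, 3) for r in within_rs], round(between_r, 3))
```

In this constructed case the correlation across persons is perfect and positive while the correlation within every person is perfect and negative, so that neither could be inferred from the other.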

Many psychologists have, of course, been concerned with the 
observation of individual dynamics. Clinicians and other students 
of personality often use this approach exclusively, and more recently 
attempts have been made to apply factor analysis to observations 
of variation within the single individual.¹¹ Cattell suggests that data
for this purpose may be obtained by observing responses of the same 
individual to the same tests under conditions which change either 
in accordance with some systematic plan or simply under the influ- 
ence of the uncontrolled factors in the situation (P technique) (8, 
9). Presumably any of the procedures discussed thus far could be 
adapted to such data to discover which of the reactions of an 
individual (to items, tests, etc.) show concomitant change, common 
factors, or unidimensionality, but the practical difficulties are great. 

Similar problems arise in the attempt to discover functional 
unities among group phenomena, though very little systematic 
attention has been given to the matter. For example, group char-
acteristics such as cooperativeness, low intragroup aggressiveness, the
perception of uniform attitudes, the absence of cliques, and high
output have been regarded as indices of group "cohesiveness." Sets
of variables of this sort have been manipulated experimentally and 
observed to affect intragroup behavior (21), but there is little evi- 
dence that such variables change together, so that we are justified 
in identifying a functionally unitary construct to be called cohesive- 
ness. If this were done, it would be analogous to studying organiza- 
tion within the individual. On the other hand, the correlation of 
characteristics between groups is comparable to looking for func-
tional unity in the covariation of characteristics as you move from

11 See the description of other possible patterns of variation (8; 9,
pp. 28 ff.).
quate metrics, one which is appropriate to both the data and the
theory of social science? Devising an adequate metrics is
particularly difficult when the theory and data make
different demands. As already indicated, the data do not
for the most part fit the assumptions of ratio scales, and yet many
theoretical concepts, particularly motivational ones, are based
on models that assume processes differing in energy and amount
which cancel or reinforce each other and which are dispelled in 
substitute activities. If we are to describe such constructs accurately 
and devise standardized instruments for their measurement, deci- 
sions will have to be made regarding the precise character of the 
relationship between the component processes whose relations define
such complex constructs. This is illustrated in such concepts as
displacement and other Freudian mechanisms, or the convergence
and opposition of vectors in Lewinian systems, or increments of
habit strength, drive strength, and effective action potential in the
Hull model. The problem is as yet unsolved, and it can be said only
that the crude and indecisive tests of relations possible in the
present state of measurement techniques do not in fact yield any-
thing approaching a dynamics of behavior. We talk about vectors,
forces, and energy, but we do not measure them. 

When correlations are based on data derived from the observa-
tion of two or more processes in a number of individuals, the correla-
tional evidence of organization of those processes reflects, in the very
nature of the case, only the static relatedness of a cross-sectional
sample of events, such as one sees in a single frame of a motion
picture. If procedures are to be developed for dealing with functional
unities, such as motivation and drive, attitudinal structure, substitu-
tion processes (displacement, sublimation), and group cohesiveness
and attractiveness, attention must be given to more than a pattern 
of static relations such as are available in data on interindividual 
or intergroup variation. The parts of the pattern must be observed 
as they interact and change together. This means observing change 
in the same person or in the same group. 

In other words, the attempt to manipulate variables through 
sampling individuals or groups is analogous to establishing Ohm’s 
law by measuring resistance, potential, and current at one value level 
in one circuit and at a number of other value levels in that number 
of additional circuits. The alternative is, of course, to observe change 
break into multiple factors when factor analyzed (20), and more
than one factor emerges from tests scaled by the Guttman technique
(40, p. 201). Test items chosen for their high indices of discrimina-
tion do not necessarily scale, and a correlation matrix may be broken
many ways. It is important to discover when different methods
yield the same results, but the fact that they sometimes fail to do
so does not necessarily discredit any procedure. Other things being
equal, the decision about which method is best must depend on the
scientific fruitfulness of the constructs produced. This problem will
be discussed in the next section.

Generally, the methods of discovering functional unities simply
reveal the presence of concomitant variation. Moreover, a large
proportion of these investigations involves analyses of variation
between individuals. It is suggested that the identification of dynamic
interrelations requires the observation of change within the individ-
ual or group rather than interindividual or intergroup variation.
But the statistics of correlation alone are not adequate to deal with
problems of dynamic organization, even when applied to data from
the single individual. The definition of such dynamic constructs
requires the demonstration of actual interdependence and interac-
tion between part processes and a statement of the nature of these
relationships. Establishment of such relationships necessitates the
experimental manipulation and control of the variables in question
and a metrics that is appropriate to the data and to the theory of
the science.

The discussion has dealt largely with individual psychological
processes because the problem of functional unity has been explored
most extensively in this area. But questions about functional unities,
which are the fundamental constructs of our thinking, must be
answered for all areas of social psychology, including interactional
processes, group structure, and related topics.


THE INTERPRETATION OF FUNCTIONAL
UNITIES: VALIDITY

It is obviously not sufficient to stop with the discovery of func-
tional unities. The processes isolated must be interpreted and given
meaning if they are to have use in a scientific system. This is the
individual to individual. Thus, comparative studies of cultures, such
as Murdock’s correlation of kinship systems and incest taboos (34),
take the relatively static approach of showing that, as cultures are
now constituted, some characteristics go together. Here, again, it
may be questioned whether a dynamics can be established from
these sorts of data.


The Problem of Functional Unities

The problems which have been discussed in this section are
those that arise when the attempt is made to discover processes which
can in some measure be bounded and set apart from the on-going
stream of events for the purposes of scientific observation. Any com-
mon set of identifiable characteristics may be the basis of classifying
and isolating events, but the unity discussed here is that discovered
by techniques for demonstrating concomitant variation, interde-
pendence, or dependence of events on each other: item analysis, scal-
ing, intertest correlation, factor analysis, and demonstration of
dynamic relations.

There is no evidence from any of these methods of analysis that
psychological unities are indivisible and ultimate or that we shall
ever have such units. Moreover, the parts of psychological reality
do not break out cleanly or stably, and the foci and character of
organization change with the specific conditions under which re-
sponses are elicited, i.e., with the instructions given, the questions
asked, the perceptions by persons in the sample of the meaning of
the situation, their attitudes, needs, and capacities. In other words,
the organization of psychological processes and the functional
unities reflected in response must be thought of not as rigid, clearly
delimited segments, like atoms or genes, but rather as shifting and
somewhat unstable events more analogous to wave patterns in a
fluid medium or currents in a mass of air. The medium sets limits
on the patterns that can arise, on their stability, and on the forces
that can be generated, but a host of contingencies external to the
medium also helps determine the character of the patterns and the
magnitude of the resultant forces.

The patterns revealed also vary with the methods of determin-
ing functional unity, which do not necessarily yield the same seg-
ments of psychological reality. Responses to Thurstone scales often
the situations or responses which have been shown to be functionally
related. This may be called face validity. It will be recalled, for exam-
ple, that Guttman proposes that the universe measured by a scale be
known by the content of the scaled items (40, pp. 53-54). This usually
means that the referent of the attitude which is the common object
of all the items is taken as the defining concept. If the investigator
makes up a test consisting of items about Russia, the instrument, if
it scales, is quite arbitrarily called a measure of attitude toward
Russia. When May and Hartshorne (26) devised situations in which
children could steal or lie or cheat, these concepts were defined
denotatively. The behavior observed in the situation was what was
meant by stealing, lying, and cheating. If the behaviors had been
shown to be functionally unitary, they would have been interpreted
by their common characteristic, deceit.

The interpretation of the factors discovered by factor analysis
may be made in similar fashion. Thurstone, using the simple-struc-
ture principle in order to arrive at "psychologically meaningful"
factors, has this to say:

In order to interpret the primary factors it is usually neces-
sary to examine the tests which have high saturation on a factor
and to discover what is common to them. When such an hypoth-
esis has been found it must be checked by examining all of the
tests which have zero or nearly vanishing saturations on the
factor. When a factor is fairly well understood then its
presence or absence can be predicted with some confidence in a
new test which has not hitherto been investigated factorially (42,
p. [illegible]).

Thus, if a reference axis runs through a group of tests manifestly
involving verbal skills, this suggests that the common factor is a
verbal factor.

There can be no objection to observing certain behaviors and
saying that they are what we shall mean by honesty or extroversion
or aggressiveness, provided we bear in mind the specific and limited
meaning of the words. But certain cautions must be noted. More
often than not, additional meanings are smuggled in and the as-
sumption is made that the observations are, in fact, interpretable
as a sample of a known universe; that, in other words, the instru-
ment in question measures some process known to manifest itself
in a specified universe of behaviors under a specified universe of
problem of validity. Although the distinction between functional
unity and validity seems clear enough when so stated, it is not, in 
fact, precise. In both cases we are concerned with relationships be- 
tween events, and many of the problems involved in the search for 
units are the same as those in establishing validity. Furthermore, 
some of the procedures which we have identified as appropriate for 
the discovery of organized processes have often been regarded as 
tests of validity. For example, the demonstration of internal con- 
sistency by item-test correlations has been called an index of item 
validity. Similarly, tests have been “validated” by correlation with 
other tests, although these methods have been discussed in this paper 
as evidence of functional unity. The notion of dynamic organization 
comes even closer to the complex of interrelations involved in valid- 
ity studies. The distinction is, then, not clear-cut. 

A case might thus be made for the unimportance of the dis- 
tinction between functional unity and validity, but it is convenient 
for our purposes to separate two aspects of the question of related-
ness of process, whatever terms are employed to identify them. One
is the question of isolating parts with sufficient integrity for study 
and investigation. The other is the problem of giving these parts 
meaning by finding the way in which they fit into the whole pattern 
of events. The first question tends to push toward the discovery 
of simpler (though not necessarily irreducible) segments. The second 
raises issues about the general structure of the science. Ultimately 
the two aspects must be neatly integrated, for, as we have already 
suggested, the choice of part processes for investigation should be 
made in terms of over-all theory, and the nature of integrated theory 
will depend in part on the components which are identified. Mean-
while, our point is simply this: when indices are used to indicate
organized and more or less isolable process, they are regarded as tests 
of functional unity. When, however, a given item or test or cluster 
of tests or molar segment is the unit of organization, other tests or 
items or processes are considered in some sense discrete. To discover 
the complete network of relationships of any variable to other vari- 
ables external to it is to give it meaning, to explore its validity. 

Face Validity 

It is possible to name and interpret any observed process or 
functional unity simply in terms of manifest similarities found in 
the evidence of functional unity between two tests, such factors as
the weights assigned to test items, the extent to which the variables
conform to the assumptions implicit in the statistics used, the
difficulty level of the tests in conjunction with the distribution of
the measured characteristic in the population sample, the selection
of specific mathematical procedures for expressing the relationship.
All these problems are equally serious in validating a test against
a criterion, and the reader is referred to the technical sources for
more detailed discussion of them (4, pp. 1215 ff.). Only two further
points will be considered here: the problem of the selection of a
criterion and the limitations of this method of validation.

The choice of a criterion will be determined by some practical
or theoretical consideration. The practical need to select successful
salesmen will dictate that some measure of success in salesmanship
be taken as a criterion, or the demand for information on how many
soldiers will go to college after leaving the Army quite obviously
requires that a test of attitude toward this topic be validated against
the criterion of actually going to college. Again, if the theory is held
that lack of self-confidence makes people poor judges of others, a
test of self-confidence might be validated against some measure of
accuracy in judging others. In reality, the selection of any criterion
is based on a theory about the relation of the measured process
to the criterion situation or behavior, but typically the theory in-
volved is some common-sense assumption, such as the notion that
a person who dislikes Negroes will be more likely to vote for
segregating them than a person who likes them, or the assumption
that the man who has a favorable attitude toward the church will be
more likely to go to church than the one who expresses bitter resent-
ment against the church.

It will be suggested that this level of theorizing is inadequate
for the validation of tools or constructs of maximum scientific use-
fulness. For the present it need be said only that in the absence of
an integrated theory of behavior which dictates the selection of
validating criteria, literally any behavioral event or resultant may be
chosen if it is itself to be predicted or if it is judged to reflect some
process of interest. Practical common sense has often been the guide.

Any demonstrated relationship between a criterion and the
data yielded by an instrument provides an additional fragment
conditions. Thus, a test is assumed to sample adequately and in a
systematic way the manifestations of some process called attitude
toward Russia, or aggression, or honesty, or cohesiveness. However
reasonable the hypothesis may appear, the character and limits of
the behavior universe sampled cannot be set merely by identifying
a common characteristic of the items of a test and assigning to the
measure the generality possessed by the name of that characteristic.
The limits must be determined.

Furthermore, it is necessary to be certain that the manifest
[illegible], since unobserved and uncontrolled factors may be
responsible for the order among the items. [illegible] suggested, for
[illegible] the Army may have been responsible for [illegible]. If
this was true, it is misleading to consider that the order
[illegible]. The same problems are present [illegible] on tests. If
the tests cluster around an axis [illegible] are involved may well
be the [illegible] possibilities should be entertained and checked.
[illegible] common difficulty level [illegible] different from those
of other tests, or they might be less [illegible] than other tests,
and so on. One advantage of [illegible] same factors in many
different sets of data and [illegible] which vary in minor ways is
that accidental bases of correlation may be ruled


Prediction to a Criterion 

[illegible] determining validity of an instrument is the simple
[illegible] correlation between the measures made by the
[illegible] criterion. When the correlation is high, the [illegible]
prediction to the stated criterion and [illegible] named by the
criterion, as in college-aptitude tests [illegible]. The reader will
recall the discussion of the factors which affect





deducing consequences, and testing the deductions under conditions 
of controlled observation. 

If behavior theory leads to deductions about conditions of 
change in process A and the effects of A on other processes, ways 
must be found to determine the accuracy of these deductions. When 
predictions prove to be correct, both the theory and the construct 
as measured are validated in some degree. Where the predictions are 
seriously in error, it is difficult to spot the trouble unless either the 
theory or the method is sufficiently well established to be above 
suspicion. Thus, if the orbits of a number of stars and the theory 
of gravitation all point to the existence of an undiscovered stellar 
body, the failure to find the star will be attributed to imperfections 
in the instrument unless the investigator suspects that there might 
be conditions under which even widely accepted principles no 
longer tell the whole story. If, on the other hand, both measures 
and theories are suspect, untangling the source of the trouble will 
involve the clumsy methods of trial and error which are familiar to 
the psychologist. In any case, it is clear that validation of theory and 
of instruments of observation tend to proceed simultaneously and 
that they can be separated only in so far as experience has accumu-
lated to suggest that predictions made from a given theoretical
structure tend to work out well when the events involved are meas- 
ured by one set of instruments and badly with another set; or, 
conversely, that although a given method seems adequate in testing 
predictions from theories A, B, C, and D, things go wrong when
predictions are made from theory X. 

There are still other complications which make the interpreta-
tion of the unconfirmed prediction ambiguous. Just as the behavior
and position of a star is a prediction from the interaction of many 
stellar bodies, so must behavior of organisms be predicted from 
more than one variable. This means that the measurement of any 
or all of the variables and the theory that relates each of them to 
behavior may be in error. In so complex a situation, it is no wonder 
that validation procedures began in the oversimplified and naive 
attempt to predict single criteria from single variables, in the hope 
that by luck some clear-cut relationship would emerge. 

Even though the confirmation of a prediction chalks up a score
in favor of the theories and methods involved in the prediction, the 
of meaning, and, as the measures are shown to be related to still
other criteria, they take on further significance. If, for example, a
scale purporting to measure attitude toward Britain proves to be 
correlated with amount of historical knowledge, with church mem- 
bership, and with income, we know more about the scale than we 
would if only one of these relationships had been established. 
Such information may lead to suggestive hypotheses about the source 
of the correlations and thus to further studies. A protest must be 
entered, however, against the proliferation of blindly empirical 
validities which are without the disciplined guidance of theory, for 
the increment of meaning from the accumulation of miscellaneous 
correlations may ultimately approach zero. The short-run practical 
uses of this sort of validation procedure are evident in the design- 
ing of aptitude tests for selection purposes, but they do not appear 
to have provided the kind of interpretation needed in building 
a scientific structure. 

This limitation is especially clear in attempts to validate meas-
ures of those important psychological constructs which are known
as intervening variables (attitudes, motives, drives). It is useful to
know that a questionnaire on attitude toward religion is answered
differently by those who are church members and those who are
not, but before such a concept can have any systematic significance,
other steps are necessary. A theory about the structure and content
of the attitude process and its interrelations with other processes
in the determination of behavior must be worked out, and studies 
must be made to discover whether the hypotheses are supported. 
We turn, then, to a discussion of this concept of validation. 

Validations by Testing Predictions from Theory 

The essence of the approach to validation through testing pre-
dictions from theory may be stated briefly. The meaning of any
measured process is given not only by a description of operations
used in isolating it from other processes and in assigning some index
of quantity but also by knowledge of its influence on other processes
and their influence on it. Consequently, to establish the validity of a
construct and of the defining measures is to conduct experimental
investigations. This involves all the problems of formulating theory,
The difference then between an indicant and a measure is
just this: the indicant is a presumed effect or correlate bearing
an unknown (but usually monotonic) relation to some under-
lying phenomenon, whereas a measure is a scaled value of the
phenomenon itself. Indicants have the advantage of convenience.
Measures have the advantage of validity.

This may seem to suggest a somewhat more clear-cut definition of
validity than has been proposed here, by restricting it to the rela-
tionship between the measurement and the process measured. This
is what we would really like to know, and it would be convenient
if ways could be found to discover this relation without recourse
to the complicated procedures of predicting from the measures
and testing the predictions. The trouble is that there is no direct
access to the underlying phenomena. It appears that we shall always
observe indicants, for we cannot get inside and watch the attitude
at work. The hope is that we shall approximate more and more
closely the law which relates indicant and the thing we want to
measure. That we have done so can be known only from the observa-
tion that, if we assume some specific relation of process and measure,
our predictions to other events are more accurate than when some
other relation is postulated. The relationship must be guessed at and
tested by its fruits.
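
The logic of guessing at a relation and testing it "by its fruits" can be
sketched in a few lines of Python. This is only an illustration under
stated assumptions: the data are invented, the two candidate relations
(identity and logarithm) are hypothetical choices, and accuracy is judged
by the in-sample squared error of a straight-line prediction, which is a
simplification rather than a procedure named in the text.

```python
from math import log


def fit_line(u, y):
    """Least-squares slope and intercept for predicting y from u."""
    n = len(u)
    mean_u, mean_y = sum(u) / n, sum(y) / n
    suu = sum((a - mean_u) ** 2 for a in u)
    if suu == 0:
        raise ValueError("predictor is constant; no line can be fitted")
    slope = sum((a - mean_u) * (b - mean_y) for a, b in zip(u, y)) / suu
    return slope, mean_y - slope * mean_u


def prediction_error(relation, indicant, criterion):
    """Mean squared error when the criterion is predicted linearly from
    relation(indicant), i.e. from the process value that a postulated
    indicant-to-process relation assigns to each observation."""
    scaled = [relation(v) for v in indicant]
    slope, intercept = fit_line(scaled, criterion)
    errors = [(y - (intercept + slope * u)) ** 2
              for u, y in zip(scaled, criterion)]
    return sum(errors) / len(errors)


# Hypothetical data: an observed indicant and some other event to predict.
indicant = [1, 2, 4, 8, 16]
criterion = [0, 1, 2, 3, 4]

# Two postulated relations between indicant and underlying process.
candidates = {"identity": lambda v: v, "logarithm": log}
errors = {name: prediction_error(f, indicant, criterion)
          for name, f in candidates.items()}
best = min(errors, key=errors.get)
print(best, errors)
```

With these invented numbers the logarithmic relation predicts the
criterion better than the identity relation. A serious comparison would
test the predictions on observations not used in fitting, and under the
range of conditions for which validity is claimed.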

We return briefly to a point which was mentioned in the early
pages of this chapter. When evidence of the validity of an instru-
ment has been obtained, it must be remembered that this validity
has been demonstrated under a circumscribed set of conditions.
These conditions may be explicit or implicit; their range may or
may not be carefully spelled out; but they are in any case a neces-
sary part of the description of the validity (or functional unity or
reliability) of any instrument, for it is unlikely that any relationship
is ever completely and unexceptionally true. There are limits within
which it holds, and evidence should be provided about the condi-
tions which must be constant and those that may vary without
disrupting the prediction on which the evidence for validity rests.
For example, it would be questioned whether the findings of the
California study could be reproduced in some other culture, whether
they hold for all motivational conditions, for all levels of intelli-
gence. If they do not hold up, the scales are inappropriate for these
evidence from one such confirmation is, of course, never conclusive.
It must be supported by further checks in situations where the proc- 
ess measured in a certain fashion is combined with other variables to 
produce predictions of other consequences. By this procedure, the 
limitations of a construct, as defined by a certain method of observa- 
tion, will become apparent as evidence accumulates regarding its 
successes and failures. 


It is difficult to find examples among psychological studies 
which illustrate these points, because this method of validating 
measures has rarely been used deliberately, and never extensively. 
Certain aspects of the California studies of the Authoritarian Per- 
sonality may serve, however, to clarify the problem (2). The study 
was conceived around the theory that such psychological processes 
as attitudes toward out-groups, toward authority figures, toward 
[...] conventional morals are not independent
but [...] to conflict and hostility [...] ego-integrated attack [...]
extremes of repression, projectivity, displacement, and ego-alien
[...] conventionality, and docility before authority and [...]
attitudes of tolerance for [...] unconventional, and a more objective
evaluation of [...] presented involves correlations between [...]
measures of these various processes. The correlations are [...] as
would be expected if the theory [...] and the measures [...].
However, it is obvious that the study [...] satisfactory test either
of the hypotheses [...] involved, since there is no attempt to [...]
conditions that systematically control the variables. The [...]
limitations of correlational techniques in the study of dynamic
organization, which were discussed on pages 278-282 [...] on the
present predicament in validating measures of intervening variables,
suggesting that we are now dealing [...] with what he calls
"indicants," related by unknown laws to the psychological dimensions
in which we are really interested. We observe restlessness and infer
drive, or we observe verbal statements and infer attitude. He goes on
to say,
different abilities from those required by the test as a whole. But
when the meaning of unreliability is taken as instability of reaction, 
the character of the items introduces unreliability in so far as they 
are not sufficiently specific in their effects to tap the same process 
or instigate the same reaction on different occasions. Loevinger's 
second type of error refers to the numerous uncontrolled processes 
in the individual that influence reliability: motivations, sets, fatigue,
boredom, and similar variables which change from one testing to 
another. 

Two other types of variable should be added which may modify 
the measured results from one observation to the next: (1) all the vary- 
ing social and physical stimuli which affect the reaction, and (2) vari- 
ations in the recording and interpretation of the behavioral events. 
This latter source of variable error may be isolated from the others 
by the familiar procedure of ascertaining coder or observer relia- 
bility through independent observations and independent coding 
of the protocols by several persons. The elimination of the other 
causes of variation that lie in items, in the total situation, and in 
the individual requires the application of the usual techniques of 
experimental control. 
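
The check on coder or observer reliability described above can be
illustrated with a small Python sketch. It assumes two coders who have
independently assigned one category to each protocol. The function names
and the data are hypothetical. Cohen's kappa, a chance-corrected index of
agreement, is not mentioned in this chapter and is shown only as one
commonly used supplement to simple percentage agreement.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Proportion of protocols given the same code by both coders."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("need two non-empty code lists of equal length")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Agreement between two coders corrected for chance agreement."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement: both coders pick the same category independently.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned independently to the same four protocols.
coder_1 = ["pos", "pos", "neg", "neg"]
coder_2 = ["pos", "neg", "neg", "neg"]
print(percent_agreement(coder_1, coder_2))
print(cohens_kappa(coder_1, coder_2))
```

Such an index isolates only the variation due to recording and
interpretation. Variation arising from the items, the situation, and the
individual is left untouched and must be handled by experimental control.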

This leads us to underline an important point which is not new 
but which is usually ignored in practice (22, 12). It is misleading to 
speak of the reliability of a test or a tool with the implication that 
the reliability or unreliability is a property only of the instrument 
itself, for the error observed is the result of variation in the whole 
complex of determinants of the measured event. A procedure which 
produces stable measures under one set of conditions, or with one 
sample, or with one person may not do so with other conditions, 
with other samples, or with another individual. As indicated at the 
outset, any statement about what a test measures and how reliably 
the measurement is made must be accompanied by information 
regarding the conditions under which the statement is true. 

It is necessary to examine briefly the operations for determining
reliability in order to discover their relation to our conceptualiza-
tion of the term and the attendant limitations of the procedures.
The methods which involve correlation (the familiar test-retest,
odd-even or split-half, and alternate-forms reliability) indicate the
stability with which individuals maintain their positions with
respect to one another on repetition of the same tests or on different
sets of items which are assumed to measure the same process.12 The
reliability of means, of differences, and of other statistical indices
attempts to estimate, in accordance with the theory of error, the
probability that the true value of the index (viz., a mean obtained
from an infinite number of measures of the same universe) will fall
within a certain range of values. As Loevinger points out,

No matter how the reliability coefficient is computed, the
statistics of reliability assume that (a) the variable error factor has
an expected or average value of zero, (b) the error factor in one
set of obtained scores is uncorrelated with that in another set,
however similar the test may be, (c) the error factor in a set of
scores is uncorrelated with the true scores, and (d) the variances
of the error factors in two comparable tests are equal (32, p. 6).

These assumptions obviously cannot be met if determinants of the
reactions are changed in some systematic fashion from one observa-
tion to the next, as by changing test items; if residues of the first
measurement affect the second, as when answers are remembered
or sets changed in one direction by the initial testing; or if any
other systematic, nonrandom shifts in conditions occur from test to
test. And if the conditions cannot be fulfilled, then reliability
indices give an inaccurate and ambiguous indication of the extent
to which measures may be expected to fluctuate under the given
condition.

It is clear that these conditions are met by none of the opera-
tions for estimating reliability. The test-retest method, which at-
tempts to follow the logic of replicating observations under the
same conditions (same items, same persons), runs into the difficulty
that the measures on successive occasions are not independent and
that the variation from test to test is not wholly random in character.
On the other hand, alternate-forms or split-half procedures pose
the problem of how to separate the effects of heterogeneity of process
from unreliability. When two or more forms of a test are being
devised, the attempt is made to construct parallel items that seem
to involve the same process, and then their correlation is often
assumed to be an index of reliability. As Loevinger has insisted,
however, this solution is unsatisfactory. It is necessary to provide
more objective evidence than this that the items do, in fact, measure
the same process. But if intercorrelation between the forms is taken
as an index of reliability, it cannot also be regarded as indicative
of the communality of process. Of course, where the correlations
are near unity, it can be said that the instruments are measuring
the same process or the same set of processes and also that they are
measuring them reliably. But if the correlation is low, it cannot be
clear whether this is because of the instability of the measures or
because the different sets of items are in fact measuring different
things. Our present techniques do not make possible the separation
of these contributions to variance.

12 See Loevinger (32) for a discussion of the logic of the methods of
Kelley, Thurstone, Spearman-Brown, Kuder and Richardson, and others.

To sum up, then, it has been suggested that reliability be
thought of as referring to the amount of stability which measures
or observations reveal when repeated under conditions which ensure
that only random variable errors affect this stability. To say that a
measure or observation is reliable does not necessarily indicate that
a significant variable is being measured, or one that we wish to
measure, or one that is uncontaminated by irrelevant influences.
These are the problems of validity and homogeneity. To say that
a measure is reliable means simply that the important determinants
of the measured event (the instigating stimuli, the variables in the
reacting individual, observational techniques, and procedures for
handling the observations and reducing them to the final result)
are all sufficiently under control for us to be able to reproduce
results within stated limits.

Whether an operation can be found which accurately represents
the degree of stability of measures under conditions subject only
to variable error turns on the problem of devising ways of repli-
cating observations so that they will be independent of one another.
Meanwhile, in spite of the ambiguity of the indices which are used
for this purpose, the justification for their continued use is this:
even though we may be unable to estimate accurately the variable
error on a test, we can say that low correlations between a test and
a retest indicate that conditions are not sufficiently stable to justify
using the particular method until the causes of the instability can
be identified and controlled. On the other hand, if the correlation
is high, the situation is a stable one that warrants further explora-
tion in order to find out whether the source of the stability lies in
some artifact, such as mere recall and repetition of a previous re-
sponse, or in having found a way to control the determinants of
an important variable which is the object of study.
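
The correlational indices discussed in this section can be sketched in
Python. The sketch assumes the conventional Pearson product-moment
correlation and the Spearman-Brown step-up for an odd-even split. The
function names and the scores are hypothetical and are given only to show
what the operations compute.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Estimated full-length reliability from a half-test correlation."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """Odd-even split: item_scores holds one list of item scores per
    person; the two half-test totals are correlated and stepped up."""
    odd_totals = [sum(person[0::2]) for person in item_scores]
    even_totals = [sum(person[1::2]) for person in item_scores]
    return spearman_brown(pearson(odd_totals, even_totals))


# Hypothetical total scores of five persons on a test and a retest.
first_testing = [12, 15, 9, 20, 17]
second_testing = [11, 16, 10, 19, 15]
print(pearson(first_testing, second_testing))

# Hypothetical item scores (four items each) for the same five persons.
items = [[3, 2, 4, 3], [4, 4, 3, 4], [2, 3, 2, 2], [5, 5, 5, 5], [4, 5, 4, 4]]
print(split_half_reliability(items))
```

As the text argues, a coefficient computed in this way cannot by itself
separate instability of the measures from heterogeneity of the items, and
a high test-retest value may reflect nothing more than recall of earlier
answers.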


CONCLUSION 

This chapter has reviewed certain problems of objective obser- 
vation and a number of methods for dealing with these problems, 
with especial reference to their logic and their limitations. Atten- 
tion has been given particularly to the concepts of functional unity, 
validity, and reliability. 

It has been argued that all these problems must be seen and
evaluated in the broad context of the assumptions and methods of
science. Science busies itself with building comprehensive hypotheses
about the relationships between events, deducing what must follow
if these relationships are true, and devising methods of observing
the predicted consequences under conditions of controlled obser-
vation. To delimit and define constructs (functional unity), to in-
terpret their meaning (validity), and to produce evidence of their
stability (reliability) involves working within the framework of this
logic. The design of objective instruments and procedures requires,
therefore, a theory about the characteristics and relationships of any
variable to be measured, i.e., its determinants, its dynamic inter-
dependencies, and its consequents. The evidences of functional
unity, validity, and reliability thus obtained are, like all scientific
evidence, subject to the limitations imposed by the conditions of
the observations, for the discovered characteristics of the observa-
tional procedures are contingent on those conditions.
form of the data. These materials frequently have to be recast in
terms of the research problem in hand before they are fully usable 
by the scientist. 

There is perhaps another feature of these research methods
that distinguishes them. Because the data come ready-made, they
do not depend upon the reach of a specific investigator or research
team, whereas data that are obtained through scientific observation,
through tests and questionnaires, and through interviews are gath- 
ered for a specific purpose characteristic of a particular research 
design and are drawn only from universes in space and time into 
which the formulators of that design can send investigators. Docu- 
ments, records, and indices, on the other hand, may bring together 
data for scientific analysis from remote times and places. 


DOCUMENTS 


Since we are here concerned with research techniques pertinent 
to social psychology, the documents that will be discussed are those 
that give insight into processes of interaction. A document may 
describe a process of personal or group development; the only 
limitation on the complexity of the situations dealt with is that 
its writer must be able to embrace the situations adequately in his 
thought and treatment. The scientist must find in the account given 
the facts he needs to perform a theoretically satisfying analysis. 

The category of documents thus delineated may be termed 
expressive documents. They are at one end of a continuum at the 
other end of which are such documents as court records, official 
histories, and proceedings of commissions. In between are such types 
as newspaper stories and memoirs of diplomats, which may give
"humanistic" data but which rarely yield sufficiently detailed and
embracing statements of the interactive processes. Expressive docu- 
ments have been discussed at length in three research bulletins of 
the Social Science Research Council (3, 7, 26). Two of the mono- 
graphs contained in these bulletins give extensive bibliographies 
(3, pp. 192-201; 26, pp. 164-73). 
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pa Jcipant in a social situation or process, the originator 
of a system of recording, or the creator of an index, determined the 
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in our day, revived somewhat the impulse to communicate one’s 
inner life freely and fully in personal letters. 

Life Histories 

Diaries are a second type of expressive documents. Although 
they have been used extensively by historians and have been rec- 
ommended by an eminent psychologist as "the personal document 
par excellence" (3, p. 95), they have not been used much in social- 
psychological research. Two examples in which attempts are made 
to formulate hypotheses, in part at least from diaries, are Runner’s 
study of social distance in adolescence (40) and Cavan's study of 
suicide (U). There seems to have been no investigation that has 
drawn upon large numbers of diaries. Actually it would be very 
difficult to collect enough of them at the adult ages to investigate 
nomothetically a social-psychological problem, i.e., to be able to 
develop scientific generalizations. Almost any scientific hypothesis 
would require the imposition of controls such as age, sex, and social 
class, and this would greatly limit the universe from which one 
could obtain the diaries. It is conceivable that one might stimulate 
the keeping of diaries by the group it was desired to investigate, 
but there would be a strong tendency for interest to vary so mark- 
edly that the documents would not be scientifically comparable. 

Allport, in The Use of Personal Documents in Psychological 
Science (3), argues for the idiographic use of personal documents, 
including diaries, in scientific work. He points out that better pre-
dictions of the behavior of specific persons can frequently be made
by careful study of their past than by applying to them generaliza- 
tions drawn from the study of populations of similar people. How- 
ever useful such idiographic use of documents may be for the psy- 
chological therapist, it would impose an unbearable burden on 
social psychology if it were adopted as the general route to scientific 
knowledge. It may be worth while to develop idiographic laws 
of behavior not only for maladjusted persons but also for a few 
outstanding leaders, such as Stalin or Nehru or Peron. The general 
advance, however, must come from scientific generalizations that are 
applicable to whole categories of people under specific conditions. 

If diaries are not kept frequently enough to give much promise 
Types of Documents

[...] expressive documents into (a) personal letters, (b) [...]
either diaries, autobiographies, or what Abel (2) calls biograms
(see p. 304), and (c) accounts of small-group process.
As will appear in the discussions of the separate categories, expres-
sive documents are not always "spontaneous," for their production
may sometimes be stimulated by the scientific investigator. This
youth organizations in peasant life, and in the latter it was to dis-
cover the factors leading to participation in the National Socialist
movement. In both cases the documents were obtained by setting
up a prize contest which offered a large number of awards. Ano-
nymity was assured. Those who were eligible and the area of life
to be covered were carefully specified in the announcements. Abel
believes that this method provides freely written mass data on
specific types of experience suitable for analysis by the social scientist.
A somewhat similar technique was used in a study of German refu- 
gees from Hitler (4). 


Accounts of Small-group Process 

Accounts of small-group process by a participant are so rarely
written spontaneously that they have not been used as the basis of
any large investigation. In one case, at least, such documents have
been stimulated by the researcher for a nomothetic type of study.
Angell obtained from university students 50 documents on family
life before and after the impact of the depression (5). The writers 
were paid a small fee. They wrote from a rather broad outline 
which suggested aspects of family life that were to be covered. A 
sample topic was: “Discuss the external conditions of your family’s 
existence prior to the decrease in income. Touch on the type of 
neighborhood, the house and yard, the family’s material possessions, 
etc.” A number of hypotheses about family organization under stress 
were tested by analysis of the data. 


Use of Documents 

The peculiar value of expressive documents is that in them 
life is discussed in terms meaningful to those involved. The pre- 
conceptions of the investigator do not determine the nature of the 
data obtained. To the degree that social psychology needs to under-
stand the "definition of the situation" of participants, such docu-
ments constitute an invaluable source of scientific information.

Because expressive documents are rarely sufficiently controlled
by the investigator to afford a crucial test of specific hypotheses,
they have generally been used in the exploratory rather than the
final stages of the research process. Their greatest value, perhaps, 
has been in giving investigators a "feel" for the data and thus 
of scientific usefulness, the case is even more discouraging with spon-
taneous autobiographies, and really perceptive ones from a social-
psychological point of view are rare indeed. H. G. Wells wrote one of
the few (62). Thomas and Znaniecki had real success in commission-
ing a life history by a Polish immigrant to America (52, Vol. II, pp.
1915-2226). This document goes into minute detail about almost
every aspect of the writer's life. Clifford R. Shaw has likewise ob-
tained excellent life histories from delinquent boys (43, 44, 45).

We should also include in the present category documents that
are obtained by a person's telling his autobiography. This is not to
be confused with "active interview" records, since in such an inter-
view the respondent is stimulated to deal with those aspects of his
life that are theoretically significant for the investigator. Cultural
[...] greatest use of spoken autobiographies. Kluckhohn, in his
analysis of this technique, characterizes it as "the passive
interview" (26, p. 125). He states that it is more [...] emphasizing
personality, whereas the active [...]. Notable [...] by Dyk (20)
research problem was to determine the role and significance of




which the informant is writing, the more likely that all the theo-
retically relevant material will be given in the document. This not
only makes possible the framing of specific hypotheses but gives a
richness of context that may produce a new conceptual orientation.

Since the investigator is usually trying to arrive at scientific
statements that have validity beyond the data from which they are
drawn, he must be concerned with the representativeness of the
data. This poses a well-nigh insuperable problem for one who
would work with expressive documents. Even if we allow that the
possibility of dictation obviates the exclusion of those who cannot
write, there is still the fact that some people are much more inter-
ested in expressing themselves than others. It would take the great-
est ingenuity to set up a random sample from any universe of
persons such that each member of the sample had produced a
spontaneous document useful for scientific purposes. Even when
the sample is obtained first and the members of it are then stimu-
lated to produce documents to order (which is the best procedure
to ensure representativeness), the probability that they will all per-
form the task satisfactorily is very small. This is perhaps the main
reason why expressive documents are thought to be more valuable
in the exploratory phases of research than in the definitive testing
of hypotheses.

The question of adequacy is not one that concerns only the 
documents written by the less intelligent. Even highly educated 
persons may not see all around a problem. They may recount only 
those aspects of it in which they are interested and leave out alto- 
gether aspects of great relevance for social-psychological analysis. 
Moreover, they may not give enough of the background and context 
to make clear the significance of the behavior described. The pro- 
vision of a broad topical outline is the best method of coping with
this problem, but it can hardly be expected to solve it. 

The reliability of the record largely turns upon the truthfulness 
of the informant. How can one tell whether the author is truthful?

A most interesting finding in this connection is that of Frenkel-Brunswik:
persons rated by associates as unreliable are prone to
superlative and absolute statements and to excessive repetition (24).
Possibly such a criterion would enable the investigator to reject
unreliable documents.



306 Methods of Data Collection 


producing "hunches" with respect to the most fruitful way of
conceptualizing the problem. The research scientist must become
intimately familiar with the situation under study, and one of the
best ways to do this is through careful reading of insightful expressive
documents. Thomas and Znaniecki, for instance, conceptualized
three types of personality-the Philistine, the Bohemian, and the 
documentary materials (52, Vol II, 

Expressive documents are capable not only of identifying the
significant variables in a specific problem but of suggesting hypoth
view expressed earlier that a focused document gives the social
psychologist the degree of "density of data" necessary to the full
employment of his theoretical formulations. 

This general review of the uses of expressive documents in 
social-psychological research shows that they have genuine value, 
particularly in the development of concepts and hypotheses, but 
that for a complete piece of nomothetic research they need to be
supplemented by other approaches. Several good studies, such as 
Thrasher's The Gang (53), illustrate combinations of techniques. 
Over a period of seven years Thrasher directly observed the life of
gangs, persuaded youths to write their own stories, interviewed gang 
members, examined social-agency and court records, and clipped 
newspaper accounts of gang activities. Although he developed hy- 
potheses of a broad type, he did not bring them to a definite test. 
Pauline Young, in her study of the Molokan sect in Los Angeles (63), 
likewise employed several approaches. Passive interviews were the
source of most of her documents. She also drew on her own observation,
newspaper accounts, and public records. At one point she
used delinquency statistics to validate a hypothesis that she formu- 
lated from the documentary materials. 

Although much of the research done with the aid of expressive
documents has been rewarding, it is interesting that no study has
gone through the scientific process to the point of verifying hypotheses
on data different from those with which they were developed.
Although this is very expensive and time-consuming, we shall not
really know the value of this research technique until someone
does this. 


REGISTRATION AND CENSUS DATA 

A great body of statistical data about human populations is 
collected by government, business, and private agencies. All too 
frequently, when individual investigators plan their own research,
they ignore data available from such sources. Although caution must
be observed in the use of these data, they may furnish valuable aids
to many phases of social-psychological research. A useful classification
of such data divides them into registration and census data.
A quite different problem is the reliability of the interpretation
made by the social psychologist. How do we know that another
scientist would see the same data in the document? This can be, and
has been, tested (10, 12, 48). The upshot of these tests is definitely
encouraging. Well-trained investigators will agree to a scientifically
satisfactory degree on what the personality traits and attitudes of
persons are and how they are likely to behave under specified circumstances.
In order to convince others that this is true, it is highly
desirable that as many documents as possible be published, so that
other investigators can determine to what extent they
would agree with the interpretations made.

that'^re'mtermeta. “"o '‘"ow 

psychologists, fs rrr; The^'IdyTecklm 1 

adiary. wura«ual tes^ tocluding 

the validity of the internreia?™^"!” and found that 

was greater than the rehability toweef th' 
this means is that each man tnierprcta lions What 

in certain areas of attiturfKi i sound interpretations 

were not soSy inte:p“eLrbv^^''’r 

tcrpreters of docLnents is not neces'sa^df among in 

as has usually been thought ^ tiamaging scientifically 

Prolubttton' oflsSwmer'rofbiogfrms^Th '““■'■'des- toward 

by correlating them with tbs . i5 ratings were lalidaled 

The validity coefficient was 081! Th'* “ '"=^"o'cd scale 

that obtained m the u ** considerably higher than 

-dy ihe rehalditat ’ 

reliability and vahitv ^ !f ? i^ighcr still 0 96 The higher 

from the fact that fh certainly resulted 

focused on a narrow validating scale were 

wright and French attitudes toi^ard Prohibition” Cart 

eral character how expressive documents of a gen 

questions comprising a oLiTi “o»ver a broad battery of 
valid .nfelenc«^ '-o-fd seem to show 
a specific area arc adequate ’rf' ‘''"’“curaentary malcrials for 
This is perfectly compatible with the
prisons, mental institutions, certain courts, and personnel departments
of many business firms. Usually, however, such psychological
data are not accumulated in the registration process, but even when
they are not, registration data may still be useful in social-psycho- 
logical research. 

A census is a periodic collection of data about a population, 
usually taken by a house-to-house canvass. In the United States and 
other urban societies, the census has become the source of great 
masses of data about human behavior. Although the principal col- 
lection agency for this kind of data in this country is the United 
States Census Bureau, valuable data are also widely collected in 
school censuses, in real-property inventories, and in occasional enu- 
merations by other agencies. 

The United States Census covers an enormous range both in 
terms of data collected and the types of geographical units for which 
they are tabulated.1 For example, the questions in the 1950 decen-
nial census of population and housing cover such items as age, sex, 
family composition, labor-force status, nativity, amount and source 
of income, education, migration status, type of dwelling, state of 
repair of dwelling, rental, etc. For the 1940 census some items of 
this type were tabulated for units varying in size from the United 
States as a whole to individual city blocks of large cities. Cross tabu- 
lations of many of these variables were published for a variety of 
geographical units. The census tract data available for small areas 
of large cities have served as an especially important basis for many 
social science studies (17, 41, 47). Historical comparisons with earlier 
decennial census materials may be made for some variables. In a few 
cases these go back to the first census, made in 1790. 

In addition to the decennial censuses of population and hous- 
ing, significant data are available in the Census of Business and 
the Census of Manufactures. Data on institutional populations are 
collected in connection with the population census. 

The Census Bureau has recently been collecting and publishing 
certain current types of data on a sample basis. At the present time
these appear under various series of Current Population Reports,
which include monthly reports on the size and composition of the
labor force. There are occasional reports on such subjects as family 


1 For a summary of the scope of the population census, see Hauser (28).
There are occasional guides to available census data (M. 58).
The Nature of Registration and Census Data


Registration data consists of records made at the time of the 
occurrence of an event in accordance with legal or administrative 
regulations attached to that event. Such data cover a very wide 
range of events and comprise a massive record of social life. 

The following list is illustrative of important activities covered 
by registration data: 

1. Vital events: Births, deaths, marriages, divorces, morbidity

2. Education: School attendance, grades, performance on psychological tests

3. Crime: Crimes known to police, arrests, court actions, prison records, parole records

4. Voting: Registration, voting

5. Social security payments and benefits

6. Automobile registrations

7. Draft and Army service

8. Illness: Hospital and insurance data

9. Business activity: Payrolls, production records, absentee records

10. Formal organizations: Membership, office-holding, committee participation


This list could be extended almost indefinitely, because an
essential element in an urban society is an elaborate record-keeping
system for knowledge, planning, and control. A wide
variety of events is routinely recorded as a normal part of the
functioning of this system. The information about the events
has intrinsic value, but its usefulness is greatly enhanced by
the other information also recorded. For example, records of
school attendance frequently have data on the nativity of parents,
occupation of father, place of previous residence for migrants,
scores on psychological tests, school grades, etc.

There are some kinds of registration records which contain
direct psychological measures. For example, psychological test results
and diagnoses may be a systematic part of the records of schools,
with reference to individuals by examining the psychological cor-
relates of social cohesiveness as intervening variables. 

A number of sociologists have made studies of deviant behavior 
in which registration and census data have been used to obtain 
general sociological relations and personal documents and case 
records have been used to get at psychological processes relating to 
such a framework (11, 35). Clifford Shaw (43, 44, 45, 46) followed
this pattern in studying both ecological data relating to juvenile de- 
linquency rates and personal-history documents of delinquents. 

The possibilities in combining current survey data and regis- 
tration data have hardly been explored. For example, for a sample 
of precincts it would be possible to combine survey data on social 
participation, reference groups, and political activity with registration
data on voting, split ballots, registration for voting, campaign
expenditures, etc. In this way a study could be made of the relation
between attitudes and social organization of local populations and
their political behavior.
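The precinct example can be made concrete with a small sketch. It assumes two hypothetical sources keyed by precinct: individual survey responses, which are averaged within each precinct, and registration records giving registered voters and ballots cast. The precinct labels, scores, and counts are invented for illustration and do not come from any study cited here.

```python
from statistics import mean

# Hypothetical survey responses: (precinct id, social-participation score).
survey_rows = [
    ("P1", 3), ("P1", 5), ("P2", 1), ("P2", 2), ("P3", 4),
]

# Hypothetical registration data: precinct -> (registered voters, ballots cast).
registration = {"P1": (1000, 700), "P2": (800, 320), "P4": (500, 250)}


def combine(survey_rows, registration):
    """Return {precinct: (mean survey score, turnout rate)} for precincts
    that appear in both sources."""
    scores = {}
    for precinct, score in survey_rows:
        scores.setdefault(precinct, []).append(score)
    combined = {}
    for precinct, values in scores.items():
        if precinct in registration:
            registered, cast = registration[precinct]
            combined[precinct] = (mean(values), cast / registered)
    return combined


combined = combine(survey_rows, registration)
for precinct in sorted(combined):
    score, turnout = combined[precinct]
    print(f"{precinct}: mean participation {score:.1f}, turnout {turnout:.0%}")
```

Precincts that appear in only one source (P3 and P4 here) are dropped, which is one reason the sample of precincts would ordinarily be fixed before either kind of data is gathered.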

Another example of the use of registration data is Klineberg's
study (31) of selective migration and intelligence. This study depended
on school grades, intelligence-test scores, residence histories,
and demographic data found in school records.

Closer to the usual interest of the social psychologist is utilization
of registration data to select a sample with specified characteristics
for a study in which additional data will be gathered.
From the point of view of experimental design, this may have two
purposes: (1) to select a number of groups which are similar with
respect to important characteristics; such equated groups may
then be divided into control and experimental groups, which removes
the equated characteristics as variables in the experiment; (2) to select
groups which differ on a characteristic which is a variable under
study in the research.

The first purpose is illustrated by experiments
with school children in which they are divided into control and
experimental groups on the basis of school records. Similarly, in the
film experiments reported in Studies in Social Psychology in World
War II (29) the control and experimental groups were chosen by
matching military units which had roughly similar characteristics
according to Army records.
composition, migration status, income, education, characteristics of
dwelling units, etc. Most of these are national data, but occasionally
they are broken down on a regional basis.

In addition to the great mass of published data, it is frequently
possible to obtain copies of unpublished tabulated data for the cost
of reproduction. Frequently, special tabulations will be made by
the Census Bureau at cost, on request. The introductory sections
of most publications of the decennial census contain references to
the scope of tabulated but unpublished data.

From time to time important methodological and substantive 
monographs are issued by the Census Bureau. Examples are State
Economic Areas, by Donald Bogue (59), and The Growth of Metropolitan
Districts in the United States, by Warren Thompson (55).

Uses of Registration and Census Data

different' registration data are collected in widely 
remTalf'rtu"'^ continuously through time. This permits the 

erTm .emST'"' a.‘" P'“c« and under dif- 

cornare birth ' Foe example, it may be possible to 

of the nooulatinn ""i ““endance, suicide rales, the percentage 
between rural elections, war bond purchases, etc., as 

various phases of the'‘Lsin«s“de^'' 

ship''be°w«„'‘l°' cegjstratiou dau i, in the study of the relation- 

problem Soc"aS', °!°®r' “ a research 

the framework prLide?™ ■hrmSioTb"!''" '"’•'r'' T'''’'’ 
ational variables. : ^ relations between social and situ- 

social variable! raTtudy^w'D Th"' ‘•nla for relating 

of suicide was related to ^ ®urkhctm (19) in which the incidence 
These groupb/stem,;”'”"^!' certain social groupings, 
social cohesiveness. The result ua"“’ according to their 
degree of social cohesiveuess to 1 ^“—^ 

DurUieim himts^ir r i *** incidence of suicide. Althougli 

ogh. s r V i ^ P^ycbol- 

c ground material might stiidy this relationship 
However, through "matching" studies it is sometimes possible to
combine census and other data as follows (36, pp. ):


The Bureau of Census will compile statistics from specific
census schedules of lists of consumers, furnished by business concerns.
A given firm may provide certain information about its
clients which is transcribed and placed on punch cards along with
information taken from the census schedules of the same person.
Cross tabulations are then prepared to show relationships between
the characteristics known to the census and those known to the
enterprise. No data are given out in terms of individuals but only
in frequencies or summary form.
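The procedure quoted above amounts to joining two sets of records for the same persons and releasing only a cross tabulation. The sketch below illustrates that final step under simple assumptions: the matched records, the field names (`tenure`, `customer_type`), and the values are all hypothetical, and the code does not describe the Census Bureau's actual tabulating procedure.

```python
from collections import Counter

# Hypothetical matched records: a firm's client list joined to census
# schedules for the same persons. Names are included only to show that
# they are dropped before anything is released.
matched = [
    {"name": "A", "tenure": "owner", "customer_type": "charge"},
    {"name": "B", "tenure": "renter", "customer_type": "cash"},
    {"name": "C", "tenure": "owner", "customer_type": "charge"},
    {"name": "D", "tenure": "renter", "customer_type": "charge"},
]


def cross_tabulate(records, row_key, col_key):
    """Count records in each (row, column) cell. Individual identities
    never appear in the result, only frequencies."""
    return Counter((r[row_key], r[col_key]) for r in records)


table = cross_tabulate(matched, "tenure", "customer_type")
for cell in sorted(table):
    print(cell, table[cell])
```

Only the cell frequencies in `table` would leave the agency; the list `matched`, which ties characteristics to persons, would not.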


Historical Series of Registration and Census Data 

In the study of the relations between different time series of events,
historical series of registration and census data may be important.
The use of such series in economic research is highly developed. A
number of sociological studies have related time series of social events
to depressions, wars, and other crisis events.

Although time series involving direct measurements of psychological
variables are rare, there are now some series based on asking the same
attitudinal questions in successive surveys (8, pp. 220-232). Such series
are likely to become more important as the number of annual surveys
with such continuity increases. For example, such data are beginning to
accumulate for psychological variables affecting consumer behavior in
the annual Survey of Consumer Finances conducted by the Survey Research
Center of the University of Michigan for the Federal Reserve Board.2
Similarly, a number of survey organizations have been collecting
comparable attitudinal data on foreign policy in successive surveys.
It is likely that such series will grow in number. Their importance will
be greatly increased when they cover a sufficient time period to make it
possible to relate them to established time series of census and
registration data.

The extent of the historical statistical series already available

2 Reports on these surveys appear in the Federal Reserve Bulletin.
The second purpose is illustrated by investigations in the field
of industrial relations in which plant records of the worker's productivity,
demographic characteristics, and work history may be
related to current survey or observational data (15, 22).

Another type of research is illustrated in the study by Chapin
and Jahn (14, pp. 41-50) in which data from WPA and relief records
were used in conjunction with morale measurements made by the
investigators to study the effect of type of relief on the morale of
the recipient. Studies of the effect of certain types of appeals on war-bond
sales (9) also depended on the records of war-bond sales as the
criterion variable. In these various types of studies at least one of
the principal variables was measured with registration data while
another was measured directly by the investigator.

Whenever, as is usually the case, a field-research investigation
has a geographical context, census publications provide a valuable
source of descriptive material about the population in the area
involved.


Chapter 5 has already indicated that census data are an important
basis for the construction of area samples. Census data may also
be used in matching communities as the units for research study.
In the study Public Response to Peacetime Uses of Atomic Energy,
communities with atomic energy installations were
matched on census characteristics with another sample of communities.
Then, certain attitudes of the
members of the paired communities were compared.

Census data may also be used to study the
relationships among social and economic variables relevant to a
problem. Although they generally do not include direct measures of
psychological variables, the relationships among background
variables are frequently important for the social psychologist
and may facilitate interpretation of other data. For
example, in a study of social-psychological factors affecting home
purchases it may be useful to refer, in analysis, to current census
data on the tenure status of recent migrants.

It is not generally possible to link census
data about individuals directly to other social and psychological
data about the same individuals, because the Census Bureau must keep
information about individual respondents completely confidential.
is indicated by the fact that approximately 3000 statistical time
series, covering the period from 1789 to 1945, have been published in
Historical Statistics of the United States (57).

pressed in connectio'n with th: „'s7„^l:^ “ 

a non, Id “Ganges m concept and coverage over 

'vnh thl^ille^d t 77“““ “ ’P^" 7-- Conp J 

lished historical tables deli' ‘“"r* employed in pub 

p-mat,onor:a;iri::r„tt7:d" " - - 

The Use of Published Indices

Many published indices are based on the combination or other
manipulation of registration and/or census data.
Illustrations are the Cost of Living Index, indices of business
conditions, county level-of-living indices, juvenile delinquency
rates, the F.B.I. index of crimes known to the police, and vital
rates (birth, death, etc.).

The conflict between historical comparability and
current validity is a general problem in the use of time series and is
likely to be especially important for constructed indices.
This problem may be illustrated with reference to the Cost of Living
Index (60), which is essentially a measure of changes in the cost of
a fixed list and quantity of goods. The weighting of the
items was originally based on the consumption of wage
earners and clerical workers. The items and weights
were later revised on the basis of the expenditures of moderate-income
families. For purposes of comparison over time
it would be desirable to keep the list fixed, so that
only the cost of buying the goods would vary.
However, from the point of view of current validity, this might
result in an unrealistic index which would
not represent current consumption habits. In practice, a compromise
is made between the conflicting goals of comparability and current
validity. The items and weights are revised from time to time but
not each year. When the revision is made, both the old and the
new series are computed for a while for linkage purposes.
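The arithmetic behind such a compromise can be sketched briefly. The code below assumes a simple fixed-basket index (cost of the basket in each period divided by its cost in a base period) and a single overlap period in which both the old and the revised basket are priced; the goods, prices, and quantities are invented, and the linking rule shown is one common convention, not necessarily the procedure used for the Cost of Living Index.

```python
def basket_cost(prices, quantities):
    """Cost of buying a fixed list and quantity of goods at given prices."""
    return sum(prices[item] * qty for item, qty in quantities.items())


def fixed_basket_index(prices_by_year, quantities, base_year):
    """Index of the basket's cost, with the base year set to 100."""
    base = basket_cost(prices_by_year[base_year], quantities)
    return {
        year: 100 * basket_cost(prices, quantities) / base
        for year, prices in prices_by_year.items()
    }


def link_series(old, new, overlap_year):
    """Rescale the new series so it agrees with the old one in the overlap
    year, then continue the old series with the rescaled values."""
    factor = old[overlap_year] / new[overlap_year]
    linked = dict(old)
    for year, value in new.items():
        if year > overlap_year:
            linked[year] = value * factor
    return linked


# Hypothetical prices for three periods (all figures invented).
prices = {
    1: {"bread": 1.0, "coal": 2.0, "radio": 10.0},
    2: {"bread": 1.5, "coal": 2.0, "radio": 8.0},
    3: {"bread": 1.5, "coal": 3.5, "radio": 6.0},
}
old_basket = {"bread": 10, "coal": 5}              # original items and weights
new_basket = {"bread": 8, "coal": 2, "radio": 1}   # revised items and weights

old_index = fixed_basket_index({y: prices[y] for y in (1, 2)}, old_basket, 1)
new_index = fixed_basket_index({y: prices[y] for y in (2, 3)}, new_basket, 2)
linked = link_series(old_index, new_index, overlap_year=2)
print(linked)
```

Period 2 is priced with both baskets, which is what makes the rescaling possible; before the overlap the linked series reflects the old basket, and after it the new one.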

It is frequently possible for the individual investigator to construct
indices for a specific purpose by combining or manipulating
series of published data. Although such indices usually do not meet
rigid scaling standards, they are frequently useful as rough measures
of a variable. Examples of such indices based on registration and
census data are those for "plane of living" (25), "moral integration"
(6), and "segregation" (30). Ingenuity on the part of the researcher
frequently enables him to use such indices as rough measures of
social variables. In many field situations the range of variation is
so great that even rough indices will serve to differentiate at a
satisfactory level.


The Ex Post Facto Design3

Registration and census data frequently provide data for contrasting
groups which have already been differentially subject to
some stimulus. Studies based on such data have become known as
ex post facto studies. Greenwood (27) defines the "ex post facto
experiment" as one in which ". . . we work backward by controlling
after the stimulus has already operated, thereby reconstructing what
might have been an experimental situation." This is to say that the
stimulus is not controlled by the investigator. The nature of the
experimental manipulations the investigator can make is strictly
limited.

The Chapin and Jahn study (14) of the relation of type of
relief to worker's morale, cited earlier in this chapter, is an example
of an ex post facto study based on registration data. In this case, the
stimulus (type of relief) was not controlled by the investigator. A
group of persons receiving work relief through the WPA was matched
with a group receiving direct relief but eligible for WPA work
relief. The characteristics used for matching were age, sex, race,
nativity, amount of education, usual occupation, size of family, and
length of time on relief. Morale measurements were made on both

3 Students interested in an intensive study of the ex post facto design should
see Greenwood's (27) careful analyses.
groups. The group receiving work relief scored significantly higher
on morale tests than the group on direct relief.
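A matched comparison of this kind can be illustrated with a short sketch. It assumes exact matching on a small, hypothetical subset of characteristics, with each direct-relief case used at most once; the source does not say how the matching was actually carried out, and the records and morale scores below are invented.

```python
from statistics import mean

# Hypothetical subset of the matching characteristics.
MATCH_KEYS = ("age_group", "sex", "education")


def match_groups(work_relief, direct_relief, keys=MATCH_KEYS):
    """Pair each work-relief case with an unused direct-relief case that
    has identical values on every matching characteristic."""
    available = {}
    for case in direct_relief:
        available.setdefault(tuple(case[k] for k in keys), []).append(case)
    pairs = []
    for case in work_relief:
        candidates = available.get(tuple(case[k] for k in keys))
        if candidates:
            pairs.append((case, candidates.pop()))
    return pairs


work = [
    {"age_group": "30-39", "sex": "M", "education": "grade", "morale": 62},
    {"age_group": "40-49", "sex": "M", "education": "high", "morale": 70},
    {"age_group": "20-29", "sex": "F", "education": "high", "morale": 55},
]
direct = [
    {"age_group": "30-39", "sex": "M", "education": "grade", "morale": 58},
    {"age_group": "40-49", "sex": "M", "education": "high", "morale": 61},
]

pairs = match_groups(work, direct)
mean_difference = mean(w["morale"] - d["morale"] for w, d in pairs)
print(len(pairs), mean_difference)
```

Cases with no counterpart are dropped (the third work-relief case here), so the comparison rests only on the matched pairs. Matching of this sort equates the groups on the recorded characteristics and on nothing else.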

An example of ex post facto research based on census schedule
data is the study by Freedman and Hawley (23) of the relation
between unemployment and migration. In this case the migrants
in two Michigan cities were matched on a number of relevant
census characteristics with nonmigrants at their place of origin.
The two groups were then compared on unemployment rates prior
to migration. The premigration unemployment rate for the migrants
was slightly higher than that for nonmigrants. The difference was
much less than had been expected on the theory that unemployment
was an important cause of migration for comparable groups.

A large number of other suggestive ex post facto studies have
been done without close matching on control variables. The largest
number of empirical sociological studies are of this character. An
excellent study which made ingenious use of registration data to
investigate an important problem is that by A. J. Reiss, Jr. (38) of
the relation between juvenile delinquency and the failure of personal
and social controls. This study is based on the case materials,
including psychiatric diagnoses and the case records of social workers,
for 1110 juvenile delinquent probationers in the official juvenile
court records. On the basis of his analysis of these data Reiss reached
the following conclusions, which are illustrative of types of important
relationships which may be investigated by various ex post facto
studies:

Our observations show (1) that delinquent recidivists are less
often than non-recidivists members of social groups and live in a
social milieu which is characterized by norms and effective techniques
in producing conformity behavior contra delinquency;
(2) that delinquent recidivists less often accept or submit to the
control of social groups which enforce such conformity behavior
than do non-recidivists; and (3) that delinquent recidivists are
less often persons with mature ego ideals or non-delinquent social
roles and appropriate and flexible rational controls which permit
the individuals to guide action in accord with non-delinquent
social group expectations (38, p. 204).

An important limitation of ex post facto studies of this
kind is that the members of the control and experimental groups
are "self-selected" rather than randomly assigned. This means that
the differences found between the control and experimental groups
may be due to factors connected with characteristics not matched,
other than the experimental variable.
For example, in the Chapin and Jahn study it is possible that
unmatched characteristics affected both the
stimulus variable (type of relief) and morale.
Despite the matching process, it is possible that the two groups
differed in morale even before they entered the relief
programs. Similar uncontrolled selection may have affected
the results of the Freedman-Hawley study. Despite this
limitation, the ex post facto design may be useful,
especially if replication under varying conditions is possible. It is
important, however, to understand the limited level of generalization
which is usually possible with this type of study.

Problems in the Use of Registration Data

The fundamental limitations of registration data for social-psychological
research arise from the fact that they are not generally
collected for the specific purposes of such research. The definitions
and tabulations used in calculating a published series of data may
differ from those which the researcher would use in collecting data
for his own purposes. For example, the form in which occupational
data are recorded in school records may not be ideal for the
investigator who wishes to use occupation as an index of social class.
Most of the limitations of registration data arise from the
fact that the investigator cannot impose his own standards of validity
and reliability on the data.

The completeness of registration data varies from
time to time in accordance with the efficiency of the data collection,
the nature of the data, and the motivation which the population
has to record the event involved. For example, it has frequently been
observed that a recorded rising birth rate in a country undergoing
modern development is not evidence that the birth rate is
actually rising. It is more likely that the efficiency of
birth registration has improved along with other statistical services.
Similarly, variations in the incidence of some registered
event may reflect variations in the efficiency of data collection
rather than variations in the event itself. Kuczynski (32, p. 8) reports
that extreme variations in vital rates of certain countries resulted
from periodic campaigns to register events not recorded in previous
periods. The extreme irregularities in these rates were largely
statistical artifacts.


Because registration data are collected as an incident of admin- 
istrative processes, they may suffer from their particular context. 
Income-tax returns have obvious limitations as accurate measures 
of income in many countries because of the motivations for tax
evasion. Robison (39) has pointed out that the chance that a delinquent
act will be officially recorded is related to the social status
and ethnic background of the delinquent. Similarly, Sutherland's
study (51) of white-collar crime indicates that certain crimes committed
by persons of high social status less frequently reach certain
recorded stages of the police or judicial system. 

>id '''' ‘■"''“tigator should con- 

and l.,h.r h' “ "W' ‘o give .he information required 

initr^a, „„ T'f ■■ motivation to secure the 

ISlIteX "caTTy- p. 135) reports that a. various 

as "died suddent ° certificates has been recorded 

"vitfl s.a.is&h'’ ® "."■icmh caused by five doctors." 
by unoualified I svhich the repot t was made 

Shi., in diagnoses of' thes'e cruse: r‘ed“frs“n‘"''”“' 

.i.e rerx^n ‘t’’' "ccessafy.inform^.ion but 

FcrXmp::,;:x:eThnnrx 

the record of some event .lie re “S incidentai to 

reicvance for his agency or i.e m *^7 ““V 

suie that retnila»j« registration data, it is important to be 

siorircorparrrTi"’'® °- 

pcriod of time is no indin, r""'’." P”’'" 

income stibi^n income Icsels if the minimum 

income subject to taxation has changed. 
A special problem of this kind may be changes in the practice
of assigning the event to the place where it occurs or to the place of
residence of the person to whom it occurs. The registration of births
and deaths is a case in point. At the present time these events are
tabulated both as to place of occurrence and place of legal residence.
That such allocation of the events is an important issue is evident
from the fact that in 1940 21 percent of all births were
coded "nonresident." This percentage varied between
rural places and cities of 2500 to 10,000
population, where it was 40 percent for births (61, p. 17).

In some cases the reporting agency may make studies
to test the reliability or validity of its data. In other cases, the
investigator himself should make such tests whenever possible, by checking
for internal consistency in the data or by comparison with other
series. One indication of the coverage of a registration system is the
1950 check comparison of birth registration with infants enumerated
in the 1950 U.S. Census (42). For the United States as a whole,
birth reporting was estimated to be 97.8 percent complete. Figures
for individual states ranged upward from 88.1 percent, with
Connecticut at the top of the range. It was less than 95 percent in only
seven states. The 1950 figure of 97.8 percent for the United States as a
whole represents marked improvement from the 1940 figure of
92.5 percent.

The extent to which registration data are subject to the limita-
tions we have presented varies widely with the administrative con-
text in which they are collected. In some cases collection is
intelligently and specifically guided by the research purposes of
the agency or for general research purposes. In any case, it is
obviously important that the researcher who uses registration data
should know something of the special circumstances which may
affect the nature of the data collected. The fact that data are pub-
lished under official auspices does not guarantee their quality.


Problems in the Use of Census Data

To a certain extent, the census [...] is
subject to the same limitations as registration data. Most important,
the definitions used in data collection may not be completely
appropriate for the purposes of a specific piece of research. It is
also true that over a period of time changes are made in the classifica-
tion system and these may produce errors in interpretation if they are
not clearly understood. For example, between the 1930 and 1940
censuses the basis for identifying the working population was changed.
In 1930 and earlier years, the working population was identified
with gainful workers, that is, persons having a gainful occupation
regardless of whether they were actually working or seeking work at
the time of the census; since 1940 the "labor force" concept has been
used to classify workers on the basis of their activity in a specified
period at or immediately preceding the census.

The importance of understanding the working definitions by
which data are collected is illustrated in a recent sample survey
by the Census Bureau (56). In this survey 844,000 persons classified
as having moved between farm and nonfarm residences had not
actually changed their physical location. Their classification as
"migrants" resulted from a change in the use of the land on which
they were resident. Similarly, estimates of the amount of unemploy-
ment based on sample census data vary according to the definition
of what constitutes part-time employment, the status of unpaid
family workers, and the line of demarcation between those who
are unemployed and those not in the labor force. These illustrations
document the common-sense point that terms found in published
tables should not be taken at face value. A careful study should be
made of the definitions usually found in census volumes and the
instructions to enumerators for collecting data.

In general, the data of recent U.S. censuses are of a relatively
high level of reliability and validity. The personnel of the Census
Bureau include an outstanding group of statisticians and social
scientists. Within the limitations of an operation of its magnitude,
the Census Bureau provides a model in many respects for meth-
odological aspects of survey work. Where known sources of error
exist, they are usually pointed out in census publications. There
is an increasing effort to incorporate checks on the quality of
the data into the operations of the Census Bureau (34).


SUMMARY 


This chapter has dealt with the use in research of certain types
of data collected by persons other than the investigator himself.
Such data are available in great quantity and for a wide range of
important problems as a result of the extensive
documentation in our society. The principal limitation
is that the operational definitions of the data and the possibilities
of experimental manipulation are outside the investigator's control.
Although this restricts their usefulness, such data remain impor-
tant. They provide unique access to historical [...] and
to some extent [...] social situations which are [...]
expensive to observe. Moreover, these data are [...]
setting.
CHAPTER EIGHT


The Collection of Data 
by Interviewing 


Charles F. Cannell and Robert L. Kahn 


In almost every field of human thought it is possible to observe 
indications of the laborious ascent from superstition and mysticism 
to scientific fact. Such observations reveal that improvement in the
systematic collection of data is a major characteristic of scientific de- 
velopment. In the long-established physical sciences the instruments 
and techniques of data collection are well developed; in the social 
sciences the development of techniques for measurement and quanti- 
fication has recently become a focus of effort and attention. 

To some extent the needs of the social sciences for data can be 
met through techniques of observation and physical measurement. 
To an increasing degree, however, social science is demanding data 
which must be reported by individuals out of their own experience. 
Attitudes, perceptions, expectations, anticipated behavior, are avail-
able to the economist, sociologist, psychologist, and anthropologist
only through such direct communication. 

In a sense, of course, social scientists have always "communi-
cated" with people and derived insights from such communications.
The problem for social science is to transform the highly subjective
process of "getting insights" into a systematic method for the collec-
tion of social data. This chapter discusses some of the principles and
techniques by which the process of interviewing can be made to
approach the criteria for scientific measurement.

Criteria for Scientific Data Collection 

The adequacy of a technique for collecting data is ordinarily
judged in terms of criteria of reliability and validity, concepts which
are discussed at length in Chapter 6. Reliability requires that re-
peated measurements yield results which are identical or fall within
narrow and predictable limits of variability. The criterion of valid- 
ity demands that the measurement be meaningfully related to the 
research objectives; that is, that it measure what it purports to 
measure. 
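
As a rough illustration of the reliability criterion, the agreement
between two administrations of the same instrument can be summarized
by a correlation coefficient. The sketch below shows only one common
way of expressing such test-retest agreement; it is not a procedure
prescribed in this chapter. The function name `pearson_r` and the
scores in `wave_1` and `wave_2` are hypothetical and chosen purely
for illustration.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long lists with at least two scores")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_a * var_b)


# Hypothetical scores for the same five respondents, measured twice.
wave_1 = [3, 5, 2, 4, 1]
wave_2 = [3, 4, 2, 5, 1]

retest_reliability = pearson_r(wave_1, wave_2)
print(round(retest_reliability, 2))
```

A coefficient near 1 suggests that repeated measurements fall within
narrow limits of variability. It says nothing about validity, which
must still be judged against the research objectives.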

Both these criteria apply not only to the data-collection instru- 
ment but also to the technique and procedure specified for using 
the instrument. The reliability and validity of social data depend 
not only on the design of the questionnaire or interview schedule 
but also upon the manner of administering the instrument, the 
technique of interviewing. The techniques discussed in this chapter 
for wording questions, constructing questionnaires, and conducting
interviews are attempts to aid the researcher in approximating the 
twin goals of reliability and validity in his data collection. 

Potentialities of the Interview

We are concerned here with the interview as a device for col-
lecting data required to test hypotheses in social research. The
principles which govern questionnaire design, interviewing, and
the training of interviewers are, however, relevant to most situations
in which information is desired from a respondent. Thus, the lawyer
must interview his client in order to represent or defend him; the
physician must base his diagnosis upon the medical interview as
well as the examination; the journalist, the personnel officer, the
social worker all depend to some extent upon their skills as inter-
viewers as well as upon those other skills which their professions
demand of them.

The fact that the interview is used very widely does not imply
that it is the best device for collecting social data in all circum-
stances. One of the choices which the social scientist must make
involves the various methods of acquiring data. One of the impor-
tant criteria is the relative accessibility of the required data to dif-
ferent means of collection. The sources and disposition of family
income provide an example of data which are at present virtually
inaccessible from sources other than personal interview.

Suppose, for example, that our research objective is to test
some hypotheses about the relationship between the source and
amount of family income and the pattern of saving and spending.
This objective requires that we assemble data on income and ex-
penditures for individual families. Although gross expenditures for
different commodities might be estimated from data provided by man-
ufacturers or trade establishments, and the volume of savings might
be determined from banks, the pattern of income and expenditure
of family units cannot be reconstructed from external sources. Such
information is uniquely available through interviews with a sample
of family units.

On the other hand, there are many data relating to income and
expenditure which can be obtained with accuracy and economy by
means other than the interview survey. We might, for example, wish
to test the hypothesis that sales of government bonds through pay-
roll deduction plans tend to increase after a plant-wide wage adjust-
ment. If this is our research objective, it is likely that the company
records, perhaps in combination with those of the Treasury Depart-
ment, can best meet our need for data. The alternative of interview-
ing relatively large numbers of people in order to identify the few
bond buyers and then questioning this group about its rate of
purchase and recent fluctuations in income is costly and complicated
by comparison.

Another kind of data which has been successfully collected by
means of interview and personally administered questionnaires
has to do with the attitudes, perceptions, and behavior of people in
work situations. For example, we might study the hypothesis that a
worker's motivation to produce will be related to the intrinsic
satisfaction which he derives from his job. Or we may hypothesize
that worker productivity depends upon the individual's perceptions
of the consequences of high or low productivity, and the extent
to which these consequences represent personal goals for him. Either
of these hypotheses requires data which are "inside" the individual
and which he alone is capable of communicating. Any other ap-
proach to assessing the individual's satisfaction with his job would
almost certainly involve a risky process of deduction and inference.

Even when the research objectives call for information which
is beyond the individual's power to provide directly, the interview
is often an effective means of obtaining the desired data. The studies
of prejudice and ethnocentrism by Adorno et al. provide an example
of such research (1). The research design called for the rating of
individuals along a number of dimensions, including anti-Semitism
and other ethnocentric characteristics, politico-economic conser-
vatism, and several aspects of personality organization.¹ Bias and
lack of training make it impossible for an individual to provide
directly and with validity such intimate information about himself,
even if he is motivated to the utmost frankness. But only he can
provide the data about his attitudes toward his parents, colleagues,
and members of minority groups, from which some of his deeper-
lying characteristics can be inferred.

In short, if the focal data for a research project are the attitudes
and perceptions of individuals, the most direct and often the most
fruitful approach is to ask the individuals themselves. As Jahoda,
Deutsch, and Cook (11) suggest, observational methods are of pri-
mary value in describing and studying behavior which takes place
in a controlled situation, in response to known stimuli. Observa-
tional methods are less likely to be useful for the measurement of
attitudes and perceptions and are obviously unable to probe the past
or to determine an individual's intentions for the future. The cri-
teria of directness and economy, and the ability to collect data about
beliefs, feelings, past experiences, and future intentions have widened
the range of application of the interview. The interview, however,
is not without its own limitations.


Limitations of the Interview

One of the limitations of the interview is the involvement
of the individual in the data he is reporting and the consequent
likelihood of bias. Even if we assume the individual to be in posses-
sion of certain facts, he may withhold or distort them because to
communicate them is threatening or in some manner destructive
to his ego. Thus, extremely deviant opinions and behavior, as well
as highly personal data, have long been suspect when obtained by
personal interviews. Nevertheless, much experience in recent years
indicates that such limitations on interview subject matter are
not to be rigidly assumed.

¹ In this series of studies interviews were used for exploratory purposes,
for the development of hypotheses and instruments, and for validation of data
obtained by written questionnaires.

Another limitation on the scope of the interview is the inability 
of the respondent to provide certain types of information. For exam-
ple, the hypothesis that paranoid tendencies are related to an inabil-
ity to work with groups demands some measurement of the re- 
spondent’s personality structure. Although he may be completely 
unqualified to make a direct judgment of such characteristics in 
himself, he is uniquely qualified to provide some personal informa-
tion from which an expert might make a diagnosis. Thus, the inabil-
ity of the respondent to provide a certain datum may mean that a 
different means of data collection is advisable; or it may mean that 
the interview must be so constructed that the respondent provides 
“raw” data which are relatively available and nonthreatening to 
him so that experts may then interpret his responses in order to 
provide the information specified by the research objectives. 

Memory bias is another factor which renders the respondent
unable to provide accurate information. Often, the only clear way 
around the problem of recall is to have the foresight and facilities 
to carry out a research design over a period of time, applying appro- 
priate measurements at the time intervals indicated by the research 
objectives. 


Summary

In summary, the interview and questionnaire appear as power-
ful instruments for social research, and the range of their usefulness
is steadily widening. Individuals' past experiences and future be-
havior are virtually unobtainable by other means. Perceptions,
attitudes, and opinions which cannot be inferred by observation are
accessible through interviews. The major problems in interviewing
stem from the inability or unwillingness of the respondent to com-
municate. These problems, as we have seen, can be surmounted
wholly or in part by various means. The skills and technique of
the interviewer, the ingenuity of the data-collecting instrument, and
the knowledge of the analyst can compensate to some degree for
the biases, memory failures, and inexpertness of the respondent.


PSYCHOLOGICAL BASIS OF THE INTERVIEW

In discussing the process of collecting social data, we have 
implied that the questionnaire is a measuring instrument or device 
used by the social scientist in much the same way that specialized 
instruments of measurement are utilized in other fields. The inter- 
viewer is a technician who manipulates the instrument, takes the 
appropriate readings, and records the results. In this sense the inter-
viewer's function parallels that of scientific technicians in other fields.
First, he must be provided with a questionnaire which is adequate
to the research objectives. Secondly, he must ask the questions and
record the responses in a standard way.

Considering the interviewer as a scientific technician and the
interviewing process as a scientific technique implies that we are
able, through the application of a specific instrument in a specific
manner, to achieve identical results in given situations. There is
the further implication that we are able to specify explicitly each step
which the technician must follow in using the instrument.

There is sufficient similarity between the scientific technician
and the interviewer to make the analogy attractive; however, it does
not hold completely. If the interviewer were only to ask a specific
question in a standard way, he would not succeed in obtaining
responses from a variety of respondents which reflected the same
degree of [...], of completeness, and so on. In short,
the interviewer cannot apply unvaryingly a specified set of tech-
niques because he is dealing with a varying situation.
The scientific technician in other fields, whatever the complexities
of the substance with which he works, is not confronted with the
necessity for taking into account the defenses, the varying
motivations, and the divergent [...] with which he is dealing.
The interviewer, on the other hand, must take account of such
factors as these, and the measure of success of the
interviewer is very largely dependent upon the degree to
which he is insightful and successful in recognizing and
dealing with the social-psychological phenomena of the interviewing
process.

Contemporary social science does not provide the interviewer 
with adequate methods for dealing with all the variables at work 
in the interview. To some extent this might be thought of as a 
symptom of the youthful inadequacy of social science in general and 
social psychology in particular. To a considerable extent, however, 
it is also a function of the unusual complexity of the subject matter 
on which the interviewer, as a scientific technician, is exercising 
his techniques. 

Much of the available literature consists of rules of thumb, pre-
sented as lists of "do's" and "don'ts" for the interviewer and for
the questionnaire framer. These do's and don'ts are essentially non-
systematic compilations of interviewing experience derived from a
variety of situations over a considerable period of time. One might
regard them as the "folklore" of interviewing, based on experience,
and thus as having a good deal of pragmatic utility. They often 
represent practices which have achieved a degree of success in a 
variety of situations. Nevertheless, they have the disadvantage of 
being somewhat unsystematic in their approach to the interview 
process. There are few scientific studies testing these common-sense 
injunctions, and one must accept or reject, without proof, most of 
those interviewing principles which other people have judged to 
work. A final disadvantage of the common sense rules for interview- 
ing is that at best they represent a superficial statement of what con- 
stitutes successful interviewing procedures. They do not help us to 
understand the interpersonal relations between interviewer and 
respondent. They do not tell us why a specific practice makes for a 
successful or unsuccessful interview, or within what range a specific 
practice is desirable or undesirable. 

Until we have a theoretical basis for understanding the inter-
view process, and until we have tested empirically some of the inter-
viewing folklore which we frequently take for granted, we are
unlikely to advance in a basic way our knowledge and practice of
interviewing procedures. In other words, we have every reason to
suspect that we possess a powerful instrument for collecting research
data, but we do not yet know its full potentialities and limitations.

Unfortunately, social science has not yet provided a comprehen-
sive, integrated theory which enables us to understand completely
the communication process and the interaction between interviewer
and respondent. We can, however, attempt to think through the
interview process and to identify some of its major psychological
dimensions. In part, we must depend on experience; to some extent
we can borrow from recent work in communications theory and
in counseling. In this way we can make explicit the basis for the
techniques which are described later in the chapter. Moreover, each
effort at conceptualizing the variables at work in the interview
situation will contribute to the theory which is now lacking.


Respondent Motivation 

Why, in the first place, does the respondent agree to be inter-
viewed? What are the goals and motivations of the respondent? We
begin with the assumption that human behavior is goal-oriented,
that is, that an individual behaves in a specific way or performs a
given act because he perceives such behavior to be consistent with
certain goals which he wants to achieve. With respect to the inter-
view, several motives appear to be relevant. A respondent may be
induced to participate in an interview in connection with a labora-
tory experiment either because he is paid to do so, because his own
scientific interest prompts him to offer his services, or because he is
influenced by the prestige of the researcher. In recent years, how-
ever, an increasing quantity of research data has been collected from
individuals who were not thus motivated but who were chosen on
a sampling basis to represent a larger population. In such studies
different motives must be expected. Thus, the interviewer often
approaches a respondent who has little or no previous information
about the research and who is not highly motivated to put forth any
effort to become a part of that endeavor. The problem of attaining
sufficient initial respondent motivation to permit the interview
to begin is undoubtedly greatest in those cases in which the inter-
viewer must approach a respondent who has no previous knowledge
of the project and who lacks such motives as financial remuneration
or scientific interest.

In such situations, the respondent's first reaction to a request
for an interview is likely to be a compound of curiosity and adher-
ence to the social norm of minimal politeness. This level of motiva-
tion, although insufficient for an adequate interview, at least enables
the interviewer to describe the project and permits him to take
the first steps toward building sufficient motivation to permit the
interview to move forward. The initial reactions of respondents vary
considerably among different segments of the population, and these
variations are reflected in the refusal rates recorded by various sur-
vey agencies. For example, the higher rate of initial refusals in urban
areas illustrates a difference between urban and rural populations
with respect to attitudes toward the casual caller and the appropriate
behavior toward him.

For some parts of the population, the customary manner of
reacting to authority figures may do much to determine the initial
reaction to an interviewer. Thus, the interviewer may gain access
to a respondent because he is perceived as an individual or as
a representative of an agency possessing authority and commanding
respect from the respondent. For example, a respondent may react
favorably to the interviewer because he represents a well-known
research organization, a university, or a governmental agency.

The interviewer necessarily accepts these motives as a basis for
beginning to communicate with the respondent. He immediately
begins to define the situation in a manner which relates
the interview to certain goals the respondent is suspected to have
and, accordingly, gives the interview a positive valence for the
respondent.

In some studies, the relating of the interview to the respondent's
goals may be started in advance of the interview itself by means of
letters, radio, or newspaper announcements. The introduction
and statement of the purpose of the study are likewise intended
to make the interview appear compatible with, or as
a means of achieving, some respondent goal.

After initial acceptance by the respondent, the interview begins
with questions designed to develop active interest on the part of
the respondent. These are the kinds of items often referred to as
"rapport builders." The purpose of such questions is to motivate
the respondent by assuring him that the interview is [...],
that is, that its content is related to interests or goals which he
already has. An additional purpose of these questions
is to relieve anxieties which the respondent is likely to have in regard
to his own ability to play the respondent role effectively. This is
done by educating him in, or clarifying for him, the type of response
expected, thus demonstrating to him his ability to handle the pre-
scribed role. If the initial phase of the interview is successful, the
interview has reached the point where one of two major types of
motivation may be tapped, thereby ensuring continued cooperation
by the respondent.

Recent work in communications within small groups has
resulted in findings which appear relevant to our understanding
of communication between an interviewer and a respondent. These
findings can be summarized as follows: One of the motives for com-
municating is the desire to influence, in some manner, the person
to whom the communication is addressed. That is, a person will
communicate in a given situation if he believes that such com-
munication will bring about a change or effect an action which
he considers desirable (9). In the interview situation, this means
either that the interviewer will be perceived as a person who can
bring about change directly or that the interview will be seen as a
means to a desired change. A clinical psychologist, a social worker,
or physician is frequently viewed as a direct agent of change. The
respondent believes that by communicating his symptoms or
his financial [...] the person will benefit him.

A less direct relationship between interviewer and respondent
is provided by the typical market research interview, in which the
respondent believes that by his answers he is indirectly helping to
improve the product in terms of his own wishes and needs. It was
common during World War II for a respondent to preface his
responses with "You tell [...] that I said [...]."

This type of motivation can be developed only if the following
relationships are perceived by the respondent: (1) The perception of
the content of the interview as related to a change which he desires.
The respondent will not spontaneously perceive every research
project to be related to his goals. The researcher must
demonstrate this relationship or suffer the results of reduced
respondent motivation to communicate. (2) The perception of the
interviewer as a person who can bring about change or as the
representative of an agency which is able to bring about change.
For example, facts about personal income, often considered difficult
to obtain in interviews, are more readily available to the researcher
if the respondent believes that the information he proffers will help
responsible people in the government develop policies which will
contribute to the public welfare and to the well-being of the re-
spondent.

A second type of motivation depends more directly upon
the personal relationship between the interviewer and the re-
spondent. It can be defined as follows: An individual is motivated
to communicate with another when he receives gratification from
the communication process and the personal relationship. Such
motivation sometimes occurs because the interview offers the re-
spondent an opportunity to talk about topics in which he is
interested but which usually do not obtain adequate expression.
This does not imply that the respondent in a research interview
ordinarily obtains cathartic release (although this may be present
at times). It does mean, however, that the respondent obtains satis-
faction from talking with a receptive, understanding interviewer
about something in which he is interested or in which he is in-
volved. The reader will recognize this as one of the basic motives
of patients in the psychotherapeutic interview.

Interviewers are often surprised to encounter this motivation
in a research interview, in which the possibilities (or desirabilities)
of a therapeutic type of relationship appear remote. Experience
shows, however, that if the research interview is conducted properly,
this motivation is often present. The relationship in many ways
resembles the counseling relationship. Counselors and therapists
have found that freedom of communication (even the communica-
tion of deep personality conflicts) is possible under the proper condi-
tions. Rogers, for instance, specifies four qualities which he claims
are characteristic of the productive counseling atmosphere (18).
Three of these four are relevant to the research interview. He
characterizes the qualities as, first, "warmth and responsiveness
on the part of the counselor ... expresses itself in a genuine
interest in the client and an acceptance of him as a person." The
second quality is described as "permissiveness in regard to expression
of feeling. By the counselor's acceptance of his statements, by the
complete lack of any moralistic or judgmental attitude, by the under-
standing attitude which pervades the counseling interview, the client
comes to recognize that all feelings and attitudes may be expressed.
No attitude is too aggressive, no feeling too guilty or shameful, to
bring into the relationship." A third characteristic of the productive
counseling relationship is "freedom from any type of pressure or
coercion. The skillful counselor refrains from intruding his own
wishes, his own reactions, or biases, into the therapeutic situation."

Applying these criteria to the research interview, we may con-
clude that optimum communication takes place if the respondent
perceives the interviewer as one who is likely to understand and
accept his basic situation. The interviewer is thereby perceived
as [...]; that is, he is seen as a person who will
[...] and experience. This does not
mean that the respondent must see the interviewer as similar to
himself, but he must see him as capable of under-
standing the respondent's point of view. This
perception will depend more on the interviewer's attitude and
the relationship which the interviewer establishes with the
respondent than on [...]
or appearance, even though these may provide some initial
cues for the respondent.

[...] inhibit communication or distort the content of the information
given by the respondent. For example, [...] not accept the goal which
the interviewer describes as the purpose of the survey. Thus, the goal
of providing the federal [...] on income distribution may not [...]
while to some respondents when they are questioned [...] and economic
situation. Even more frequently, the respondent may possess goals
which are in conflict with the [...]. [...] plant, for example, a
worker may [...] wholly sympathetic to the notion that employee
opinions should be solicited, and he may be hopeful that frank
expression [...] certain situations. However, he [...] expression of
critical opinions may be dangerous and make him liable to retaliation
or discrimination. He [...] concerned over loss of work or loss of
promotion. [...] an interview, but it [...] limit the content areas
about which the [...] interpreted as criticism of his immediate [...].

Just as a respondent may refuse [...] may distort his communication
because he [...] the interview process, he may restrict communication
because the personal relationship with the interviewer is such that
real communication and understanding [...] impossible. This kind of
respondent reaction may occur as a result [...] stereotyped judgment
which he makes of the interviewer. [...] respondent perceives a gap of
education or an [...] difference between himself and the interviewer,
he may [...] the interviewer is incapable of understanding [...]
family circumstances or of empathizing in any way with [...]
predicament. This problem in communication is very likely to [...]
the respondents have some perception of themselves as [...] content
area of [...] interview. For example, a respondent [...] extremely
radical political views might perceive the interviewer as necessarily
differing so greatly from himself that no [...] of view [...]
possible. In such instances, the respondent has in effect concluded
that the interviewer is outside the [...] on the topic in question.
The possibility of a complete and [...] interview is, therefore,
remote.

The relationship between the interviewer and the [...] and the
character of the [...] communicated is [...] determined exclusively by
the respondent's stereotypes. [...] a number of [...] the importance
[...] in determining the [...] manner in which the interviewer [...].
[...] safe to assume, however, that [...] many of these situations may
reflect the interviewer’s [...] a prescribed [...], some are caused by
a failure of interpersonal relations between interviewer and
respondent. [...] according to his stereotype [...] interviewer may be
guided by stereotypes [...] the objective characteristics of [...].
The interviewer may, for example, set up inaccurate working hypotheses
as to how the respondent will respond and then, without awareness,
guide or distort the responses into the anticipated channels (10).

In the two following sections of this chapter, we shall discuss
those principles of questionnaire construction and interviewing
technique which in the light of present experience are most likely
to maximize a respondent's motivation to communicate and which
help the interviewer to avoid the kinds of inhibiting or distorting
factors which have just been described.


DESIGN OF THE QUESTIONNAIRE 
Dual Purpose of the Questionnaire

The questionnaire, or interview schedule, serves two major purposes.
[...] research objectives into specific questions, the answers to
which [...] by the research objectives. In order to achieve this
purpose, each question must convey to the respondent the idea or group
of ideas required by the research objectives, and each [...] analyzed
[...] objective. Moreover, the question must [...] functions with
minimal distortion of the response [...] asking a question [...] of
the respondent [...]. The question should, therefore, [...] accurately
and completely reflects each respondent's position.

The second [...] questionnaire is to assist the interviewer in
motivating the respondent to communicate the required information.
There are many [...] determining the respondent's willingness to
engage in [...] interview, as we have already mentioned. In motivating
the respondent, the skills of the interviewer are of great importance,
but the questionnaire itself does much to determine [...] character of
the interviewer-respondent relationship and, [...] the quantity and
quality of the data collected.

Since the questionnaire is constructed on the basis of the research
objectives, it is clear that constructing the questionnaire cannot
be the first step in undertaking a research project. The statement
of the research objectives and the specification of the data required
to meet those objectives must precede questionnaire construction.
The sequence of steps in planning a research project is discussed
in some detail in Chapters 1 and 2. It will be sufficient here to
provide an example of the process by which a research hypothesis
determines questionnaire content.

Suppose that, as part of a study of how behavior is influenced
by mass persuasion, we have the hypothesis that the number of
government savings bonds purchased is related directly to the
amount of direct personal solicitation. What data are required to
test this hypothesis? What questions should be asked to elicit these
data?

In the present example, the investigators decided that two
approaches should be employed, one direct and one indirect. The
direct approach consisted of asking recent bond purchasers what
factors had led them to buy. The indirect approach, during a later
portion of the same interview, led to inclusion of a number of
questions concerning the respondent's recent exposure to such
influences as newspaper advertisements, radio, other group appeals,
and individual solicitation. In analyzing the obtained data, the
researchers sorted those respondents who were comparable with respect
to income and other demographic characteristics into groups according
to the frequency and type of solicitation which they had experienced.
The buying behavior of these groups was then studied, and it
was found that buying behavior was closely related to the presence
or absence of personal solicitation (7).
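
The grouping step of this analysis can be sketched in a few lines of code. The sketch is illustrative only: the records, the field names (`income_band`, `solicited`, `bought`), and every figure are invented for the example and are not data from the study cited above. It sorts respondents into groups that are comparable in income and then compares the proportion of bond buyers among those who were and were not personally solicited.

```python
from collections import defaultdict

# Hypothetical respondent records; every value here is invented for illustration.
respondents = [
    {"income_band": "low",  "solicited": True,  "bought": True},
    {"income_band": "low",  "solicited": True,  "bought": False},
    {"income_band": "low",  "solicited": False, "bought": False},
    {"income_band": "low",  "solicited": False, "bought": False},
    {"income_band": "high", "solicited": True,  "bought": True},
    {"income_band": "high", "solicited": True,  "bought": True},
    {"income_band": "high", "solicited": False, "bought": True},
    {"income_band": "high", "solicited": False, "bought": False},
]


def buying_rates(records):
    """Return {(income_band, solicited): proportion of the group who bought}."""
    counts = defaultdict(lambda: [0, 0])  # [buyers, group size]
    for r in records:
        key = (r["income_band"], r["solicited"])
        counts[key][0] += int(r["bought"])
        counts[key][1] += 1
    return {key: buyers / total for key, (buyers, total) in counts.items()}


rates = buying_rates(respondents)
for (band, solicited), rate in sorted(rates.items()):
    label = "solicited" if solicited else "not solicited"
    print(f"{band:>4} income, {label:<13}: {rate:.0%} bought")
```

Holding the income band fixed before comparing solicited and unsolicited groups is what keeps a difference in buying from being explained by income alone. The questionnaire therefore has to collect income, solicitation, and purchase data for every respondent.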

In this example one can see how the questionnaire design flows
logically from the specified research objectives and must anticipate
the analysis of the data. Thus construction of the questionnaire
is an integrated step in getting a research project into operation.

As noted above, the second function which the questionnaire
must perform is to assist in creating conditions under which the
respondent will communicate fully and freely. Research workers are
by no means agreed on the techniques by which this can best be
achieved (12). A systematic methodological study of the
interviewer-respondent relationship has recently been completed by the
National Opinion Research Center, and the final report is now in
the process of publication (10). The experience of the present writers
argues that the characteristic of “respondent orientation” is of
primary importance in maximizing communication. The concept
of respondent orientation as a characteristic of the questionnaire
has an obvious analogy in the clinical concept of client-centeredness
(17). The comparison is more than a superficial one. The questionnaire
makes use of some of the techniques and to a small degree
serves some of the purposes of client-centered therapy. The differences
between the research and the therapeutic interview are many,
of course. In the research interview, content is determined primarily
by research objectives rather than by the respondent’s needs;
similarly, the pace and sequence of questions is for the most part
beyond the control of either the respondent or the interviewer. Even
the decision to engage in the interview is not wholly of the
respondent’s volition. In spite of these limitations, there are a
number of respects in which the interview may be oriented toward the
respondent, and the questionnaire so constructed that the needs and
reactions of the respondent are taken into account.

The preceding pages have presented the major criteria by which
a questionnaire might be judged. However valid these criteria may
be, they do not solve the specific problems of question wording and
question sequence which confront every social scientist who uses the
interview. The remainder of this section is devoted to a discussion
of just such specifics, a discussion which attempts to develop the
“dos and don’ts” of questionnaire construction out of the major
purposes which the questionnaire must serve. The topics included
by no means exhaust the subject. More detailed treatments have
been made by Payne (16), Parten (15), Cantril (6), Blankenship
(5), and others. The problems discussed here are those which
appeared most relevant in terms of the criteria cited and, therefore,
most important in creating an adequate instrument for collecting
social data by means of interviews.

Language

In the construction of a questionnaire, the primary criterion
for the choice of language is that the vocabulary and syntax should
offer maximum opportunity for complete and accurate communication
of ideas between interviewer and respondent. Not only should
the words chosen be within range of the respondent’s vocabulary,
but also his colloquialisms and cliches should either be known and
used meaningfully or avoided. Most simply stated, the language of
the questionnaire must approximate the language of the respondent.
In most cross-section surveys, one strives for simplicity, and properly
so. For a study of physicians or lawyers, however, different and
specialized vocabularies would be more appropriate. To the extent
that the respondent population is not homogeneous, compromise
is unavoidable. In such cases the solution of the problem consists in
using language which communicates successfully to the least
sophisticated of the respondent population and, at the same time, avoids
the appearance of oversimplification.


Frame of Reference 

To say that a questionnaire should be cast in the language of 
the respondent is relatively unequivocal and straightforward. It is 
equally important, and considerably more difficult, however, to 
phrase questions which take account of the frame of reference 
which respondents bring to the subject under discussion. Nevertheless,
the questionnaire must introduce each topic in a form which
ties into the perceptions of the respondent and is consistent with 
the respondent’s notions of what is and is not salient to the topic 
under discussion. The development of a topic from one question 
to another must not only meet the researcher’s criteria for reason- 
ableness and logic; it must also meet those of the respondent. Thus, 
frame of reference becomes another dimension in which the re- 
searcher must begin at the point “where the respondent is”— it 
must be respondent-oriented. 

Bancroft and Welch (2) present an example of the effect of the 
respondent’s frame of reference on his replies. They found that the 
series of questions used by the Bureau of the Census to ascertain 
the number of people in the labor market consistently underestimated
the number of employed persons. When asked the question
“Did you do any work for pay or profit last week?” respondents
reported what they considered to be their major activity. Young
people attending college considered themselves to be students, even
if they were also employed on a part-time basis. Women who cooked,
cleaned house, and raised children spoke of themselves as housewives,
even if they also did some work for pay outside the home.
The effect of the respondent’s frame of reference was to classify as
nonworkers many thousands of people who met the census definition
of workers. The solution involved revising the sequence of interview
questions, beginning with the acceptance of the respondent's
classification of himself. Thus, people were asked first what their major
activity was. Then, those people who gave “nonworker” responses
were asked whether, in addition to attending school or keeping
house, they did any work for pay. The effect of this change was to
raise the official estimate of employment by more than one million
persons.
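
The difference between the two question sequences can be made concrete with a small sketch. It rests on a simplifying assumption, stated in the code, that a respondent answers the single direct question in terms of his major activity; the four respondents and the field names (`major_activity`, `did_paid_work`) are hypothetical and are not census data.

```python
# Hypothetical respondents: self-described major activity, plus whether they
# actually did any work for pay during the week. All values are invented.
respondents = [
    {"major_activity": "working",       "did_paid_work": True},
    {"major_activity": "student",       "did_paid_work": True},   # part-time job
    {"major_activity": "keeping house", "did_paid_work": True},   # some paid work
    {"major_activity": "student",       "did_paid_work": False},
]


def original_sequence(r):
    """Single direct question about work for pay.

    Simplifying assumption: the respondent answers from his own frame of
    reference, so only those whose major activity is work say "yes".
    """
    return r["major_activity"] == "working"


def revised_sequence(r):
    """Two-step sequence that starts from the respondent's self-classification."""
    # Step 1: accept the respondent's own statement of his major activity.
    if r["major_activity"] == "working":
        return True
    # Step 2: ask nonworkers whether, in addition, they did any work for pay.
    return r["did_paid_work"]


original_count = sum(original_sequence(r) for r in respondents)
revised_count = sum(revised_sequence(r) for r in respondents)
print("counted as employed, original sequence:", original_count)
print("counted as employed, revised sequence: ", revised_count)
```

In this toy sample the revised sequence counts the part-time student and the housewife with paid work, whom the single question misses. This is the mechanism by which the revised census sequence raised the employment estimate.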

The respondent’s frame of reference may also be important 
in determining whether he will be willing to communicate a given 
piece of information. He may be reluctant to communicate if he fails 
to see the relationship between a question and his own perception of 
the research objectives. Thus, a survey respondent who has talked 
freely about foreign policy may suddenly balk at being asked his 
age or education. Although these are unlikely to be threatening
questions, they do not fit his perceptions of the research needs. The 
collection of data on family income, cited earlier in this chapter, 
provides another example of the extent to which respondent be- 
havior may depend upon his perception of what is relevant. The 
collection of detailed data on personal income, unsuccessfully
attempted in many early studies, was achieved by introducing a
request for income data as part of a program to assess, and perhaps
ultimately to solve, problems of consumer credit, spending, and
saving. In the context of discussing savings, plans for consumer
purchases, and attitudes and expectations about economic and personal
financial status, the question of family income appears to the
respondent as reasonable and relevant.

Information Level

A question must be worded so that it ties into the respondent's
present level of information in a meaningful way. No unrealistic
assumptions should be made about the expertness of the respondent
or the amount of information he possesses. The importance of this
rule for questionnaire construction lies in the fact that when the
interviewer, with the authority of his role, asks the respondent
a question, there is an implication that the respondent should be
in possession of an adequate answer and that if he cannot answer,
he is somehow discredited. If, for example, in a questionnaire
dealing with public attitudes toward problems of atomic energy, an
interviewer asks a respondent, “What precautions are appropriate
for a technician handling radioactive isotopes?” an immediate and
very common respondent reaction will be embarrassment and resentment
at being asked a question which he is unable to understand.
Not only will the researcher have lost the answer to a question,
but he will also pay a heavy price in terms of decreased motivation
to communicate. Another possibility, of course, is that the
respondent, feeling obligated to show his knowledge, will pretend
to competence which he does not possess.

The importance of asking questions appropriate to the respondent’s
level of information, and not productive of respondent
embarrassment, does not necessarily limit us to asking questions to which
every respondent knows the answer. It does mean, however, that
caution in wording questions must be used when we anticipate that
a considerable proportion of respondents will not be in possession
of the answer. For example, the question quoted above might be
preceded by a statement such as “Many people haven’t had an
opportunity to learn a great deal about the technical problems of handling
atomic material, but some have picked up information on this
subject. Do you happen to know . . . ?”

This problem is sometimes referred to as expert error, that is,
the error of ascribing to the respondent a degree of expertness in a
particular field which he does not actually possess. These “expert”
questions may require the respondent to engage in uncomfortable
self-analysis, to be verbal about material which is really unanalyzed
or unverbalized and therefore not consciously available. Suppose
we ask an industrial employee, “What is the state of your morale,
and what is the reason for your feeling that way?” It would be as if
a doctor asked a patient for the name and cause of his disease, rather
than asking for the patient’s symptoms from which the nature of the
disease may be inferred.


Social Acceptance 

Another characteristic of the respondent-centered questionnaire
is its emphasis on the acceptability of a wide range of responses.
No question should confront the respondent with the necessity
of giving a socially unacceptable response. If we expect the
respondent to answer freely and spontaneously, we must help him to feel
that the entire range of possible responses is acceptable,
not only to the interviewer but also in terms of the respondent’s own
standards for himself. For example, if after a presidential election 
we wish to ascertain who in the population did and who did not 
vote, we find ourselves in the position of asking respondents about 
a situation in which they may perceive only one socially acceptable 
alternative. The civic-minded, responsible citizen voted; therefore, 
the respondent voted, or at least should have voted and does not 
wish to be put in the position of telling the interviewer that he failed 
to do so. This hesitancy can be overcome, at least in part, through 
question wording. For example, some form such as the following 
might be used: “You know, in the last election about half of the 
qualified voters actually got to the polls and about half were unable 
to; did you happen to vote?” 

Offering a range of responses which meets the respondent’s 
criteria of social acceptability is necessary to good question
formulation. A broader statement might be that the question must never
constitute a threat to the respondent’s ego. Such a threat may be 
introduced if the respondent is required to give an answer which 
he feels is socially unacceptable, or it might come about if the 
respondent is placed in a position where he feels less well informed 
than he should be. 


Leading Questions

Questions should be phrased so that they contain no suggestion [...]
appropriate response. For example, a question [...] general attitudes
toward rent control might read, “[...] rent control?” A form of the
same question which is obviously biased might be, “You wouldn’t say
that you [...]” This kind of bias is so [...] avoid it almost without
effort. A more subtle [...] wording might be, “Would you say that you
are in favor of rent control?” This question makes it easier for the
respondent to answer “yes” than “no.” In answering “yes” he is
merely agreeing with the language of the question. It is more
difficult to respond, “I disagree,” since this response seems to contra- 
dict the interviewer, or at least goes counter to the ideas of the 
person who worded the question. 

One way in which a question may suggest a positive or negative 
answer is through the use of words which have become emotionally 
“loaded,” either favorably or unfavorably. In our culture there are 
many words so affect-laden that it is virtually impossible to expect
a respondent to give a response to the concept behind these words.
For example, prior to World War II, “Nazi” had already become
an emotionally colored word. As a result, one obtained very differ- 
ent responses to a question referring to "Nazi Germany," rather 
than simply to “Germany.” 

Another way in which a question may encourage a particular 
response is by associating one of the alternative responses with a 
goal so desirable that it can scarcely be denied. Thus, the question 
“Do you favor or oppose higher taxes to prepare for the dangers 
of war?” associates higher taxes with defense against attack and 
implies that a negative answer reflects indifference to the menace 
of attack. Even if the respondent is permitted an unstructured reply, 
the question does much to bias his answer. If he is given only the 
alternatives of acceptance or rejection, as in the typical public- 
opinion poll, the biasing effect of the question is even more serious. 

A “loaded” question is not necessarily undesirable and often 
has a real place in the questionnaire. The problem is to avoid load- 
ing if one is looking for an undistorted response. The following is 
an example of a strongly loaded question which was purposely used 
in one study. “Would you favor sending food overseas to feed the 
starving people of India?" In this case the question followed a 
series of unloaded questions and was used to determine the number 
of people who were so strongly against shipping food that they 
rejected the idea in spite of the strong emotional context of "starving 
people.” 


The Single Idea 

Questions should be limited to a single idea or to a single 
reference. The problems encountered in this area are illustrated 
in the following question: “Do you favor or oppose unemployment
insurance and pension plans?” Many answers to this question would
not permit the researcher to determine whether the respondent
is answering one or both of the items mentioned in the question.
The most acceptable formulation of a question in this area would
depend to some extent on the specificity of the research objectives.
If the purpose is to find out the respondent’s attitude toward
pensions and unemployment insurance specifically, it would be
necessary to ask two questions, one referring to each of the two proposals.
If, on the other hand, the purpose of the question is to get some
notion of the respondent's general attitude in the area of worker
benefits, it may be possible to ask a global question such as “Do you
favor or oppose such worker benefits as unemployment insurance,
pension plans, and the like?” It must be kept in mind, however,
that if a global question of the latter type is asked, the
interpretation must be very conservative. In other words, a positive response
to a global question must be taken only to indicate favorableness
in the general area and cannot be interpreted as indicating
respondent support for any of the specific examples cited.


Question Sequence 

Aside from the wording of individual questions, the researcher
needs to give thought to the arrangement of the questions in a
questionnaire. At several points in this chapter we have discussed
the concept of respondent orientation, which is also important in
[...]. Thus, questions should be so arranged that they make sense to
the respondent; that is, the [...] questionnaire should follow the
logic of [...] questions which are associated [...] the analyst's
point of view are widely separated in the questionnaire. [...]
sequence of questions should be determined primarily [...] process
rather than the research process. A [...] questionnaire facilitates
the easy progress of the respondent [...] next [...] often leads him
to anticipate the [...] because it seems to him the logical topic to
discuss.

The sequence of questions may also be determined by what is called
the funnel approach. This refers to a procedure of asking the most
general or the most unrestricted question first and following it with
successively more restricted questions. Thus in the
sequence of questions the frame of reference is gradually narrowed
by asking more specific questions. The purpose of the funnel 
sequence is to prevent early questions from conditioning those 
which come later and to ascertain from the first open questions 
something about the respondent’s frame of reference. The following 
series of questions illustrates the funnel approach: 

Question 1: How do you think this country is getting along in
its relations with other countries?

Question 2: How do you think we are doing in our relations
with Russia?

Question 3: Do you think we ought to be dealing with Russia
differently than we are now?

Question 4: (If yes) What should we be doing differently?

Question 5: Some people say we should get tougher with Russia
and others think we are too tough as it is; how do
you feel about it?

The reader will notice that the first question is very general in
its approach. It does not establish a frame of reference (a trend of
thought) in regard to the country under discussion in terms of
diplomatic relations or of economic relations. It permits the
respondent great freedom in discussing the topic. From the answer
to the first question, we can probably infer the frame of reference
of the respondent. In the second question we have restricted the
area to one country, Russia. The third question is aimed at the
respondent's opinion of how the United States ought to deal with
Russia, and the fifth becomes very specific by asking whether we
should exert more pressure or be more lenient. If, for example,
Question 5 had been asked any earlier in the sequence, it might
well have conditioned the answers to the other questions. The
funnel technique is, therefore, often very helpful in avoiding the
distortion of a question by those which precede it. It enables us to
analyze the frame of reference which the respondent is taking, and
it enables us to get a general affective response before pinning the
person down to more specific points.
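
The ordering rule behind the funnel sequence can be sketched briefly. The sketch assumes that the researcher has already assigned each question a specificity level, with 1 the most general; these levels, the shortened question texts, and the function names are hypothetical conveniences for the example, not part of the procedure as described above.

```python
# Each question carries a specificity level assigned by the researcher
# (1 = most general). Levels and shortened texts are illustrative only.
draft_questions = [
    (5, "Should we get tougher with Russia, or are we too tough as it is?"),
    (1, "How is this country getting along in its relations with other countries?"),
    (3, "Ought we to be dealing with Russia differently than we are now?"),
    (2, "How are we doing in our relations with Russia?"),
    (4, "(If yes) What should we be doing differently?"),
]


def is_funnel(levels):
    """True if the questions never move from a more specific to a more general level."""
    return all(earlier <= later for earlier, later in zip(levels, levels[1:]))


def funnel_order(questions):
    """Return the questions sorted from the most general to the most specific."""
    return sorted(questions, key=lambda q: q[0])


draft_levels = [level for level, _ in draft_questions]
ordered = funnel_order(draft_questions)
ordered_levels = [level for level, _ in ordered]

print("draft is a funnel:  ", is_funnel(draft_levels))
print("ordered is a funnel:", is_funnel(ordered_levels))
for level, text in ordered:
    print(level, text)
```

The check flags the draft because its most specific question comes first, which is the case in which an early question may condition the answers to those that follow. Assigning the levels remains a matter of judgment that no sorting routine can supply.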

The first two or three questions in a questionnaire often have
a dual function. On the one hand, they are included to obtain
information on specific research objectives, but they also help to
educate and motivate the respondent. In many instances the
respondent may not know the kind of responses expected of him,
that is, whether a one-word answer will suffice or whether he is
being asked to discuss the subject in detail. He may wonder about 
being cross-examined, or he may be simply confused about the 
interview demands regardless of the initial instructions. During the 
first two or three questions the interviewer, by his probing, his reac- 
tions to responses, and his general behavior toward the respondent, 
educates the respondent in the role which is expected of him in the 
interview. 

In addition to their orienting or educational purpose, the first 
questions also serve to motivate the respondent to participate more 
thoroughly by involving him in the topic under consideration. The 
first few questions may, in fact, set the tone of the entire interview. 

Often a questionnaire covers more than one general topic. This 
may create difficulties in the interview, since the respondent has to 
be helped to change his frame of reference from one topic to the 
next. One efficient way to help the respondent orient himself to a 
new area of discussion involves the use of transition statements or 
transition questions. Such a statement might be, “Well, we’ve been 
discussing our relations with the Far East; now we want to talk a bit 
about the way things are going for us in Europe. How do you feel
about our relations with the countries in Europe?” Statements of this 
sort help the respondent to shift gears and to transfer his attention 
to a new area of discussion. 


The Form of the Question 

Thus far we have discussed the wording of questions without
considering the problem of the form of the response, that is, whether
the respondent is to reply in his own words or whether he is to
select from a series of preassigned categories the response coming
closest to his own opinion. Questions of the former type are termed
“open” or “unrestricted”; the latter type of question is “restricted”
or “closed.” The open question is one in which the topic is
structured for the respondent but he is given the task of answering in
his own words, structuring his answer as he sees fit, and speaking 
at whatever length he desires. An example of an open question is, 
“How do you feel about Negroes and whites working together in 
this factory?” In the closed question the possible responses are
contained in the question so that the respondent merely has to
select the category which comes closest to his position. An example 
of a closed question is, "Do you think your income will be higher, 
lower, or about the same this year as it was last year?" 

Generally speaking, the closed question is well adapted to situ- 
ations in which (1) there is only one frame of reference from which 
the respondent can answer the question; (2) within this single frame 
of reference, there is a known range of possible responses; and 
(3) within this range, there are clearly defined choice points which 
accurately represent the position of each respondent. Two examples 
will help to clarify these points. The first is the classification of 
respondents by marital status. In this case, there is a known range 
of possible responses: a person is either single, married, divorced, 
separated, or widowed. Within this range, the choices are clear and 
the question has but one frame of reference for all respondents. 
Here the closed question is desirable and can be worded, "Are you 
single, married, divorced, separated, or widowed?" The respondent 
merely has to select the response which defines his marital status. 
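
A closed question of this kind amounts to a fixed set of categories against which each answer is checked. The following minimal sketch records marital-status answers on that basis; the function name `record_closed_response` and the sample answers are hypothetical, and the category list simply repeats the five choices named in the question above.

```python
from collections import Counter

# The known range of possible responses, taken from the wording of the question.
MARITAL_STATUS = ("single", "married", "divorced", "separated", "widowed")


def record_closed_response(answer, categories=MARITAL_STATUS):
    """Return the normalized category, or raise ValueError if the answer
    falls outside the preassigned choices."""
    normalized = answer.strip().lower()
    if normalized not in categories:
        raise ValueError(f"{answer!r} is not one of {categories}")
    return normalized


# Hypothetical answers as an interviewer might write them down.
answers = ["Married", "single", " widowed ", "married"]
tally = Counter(record_closed_response(a) for a in answers)
print(tally.most_common())
```

An answer that raises the error is a sign that the categories do not cover the respondent's position, which is the situation in which the closed form is poorly adapted to the question.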

Another example of the closed type is the question "Would
you say your present income was higher, lower, or about the same 
as your last year’s income?" In this question the respondent is asked 
to compare two facts which are known to him. The frame of ref- 
erence is limited to an income comparison for two years, and the 
choices are clear. 

Crutchfield and Gordon have provided an excellent documen- 
tation of the effects of using the closed question improperly (8). 
The following question was asked on a national survey: "After the 
war would you like to see many changes or reforms made in the 
United States or would you rather have the country remain pretty 
much as it was before the war?" The answers indicated that the 
majority of people wanted things to remain as they were. A follow-up 
was made in which the same question was asked and nondirective 
probes were then used to ascertain what the respondents were 
concerned with when answering the question. The responses showed 
that respondents answered from seven frames of reference. Some 
were concerned with domestic issues (employment conditions, stand- 
ard of living, etc.), some with technical improvements (better 
transportation, communications, etc.), others with political affairs, 
and so on. Since the original researcher was unaware of the varying
frames of reference, his interpretation of the findings was quite in
error. This illustrates the danger of using closed questions when 
more than one frame of reference is possible. 

Let us consider another example: "At the present time, do
you think personal income taxes are too high, too low, or about
right?" The alternatives within the question are incomplete and
fail to allow for the person who feels that the income taxes are fair
at certain income levels but are unfair at other levels. Neither are
the alternatives adequate for the person who feels that taxes are too
high because of heavy government expenditures, but that if the
expenditures continue, then taxes must remain high. Thus, for
many people the alternatives presented do not include choice points 
which closely approximate their attitudes. Such people are forced 
either into discussing the topic more fully with the interviewer or 
into selecting a category which is a poor approximation of their 
position. 

The open question has many advantages stemming from the 
fact that the respondent is encouraged to structure his answer as he
wishes. The technique provides a means of obtaining information 
which cannot be obtained adequately by use of a closed question. 
For example, it permits the respondent to state his own frame of 
reference when this is desirable. The potentialities of the open 
question for discovering motivation have been ingeniously explored 
by Lazarsfeld (14). 

Another advantage of the open question is the information 
which the answers indicate with respect to the respondent’s level 
of knowledge or degree of expertness. If the respondent has been 
led to discuss his opinions of the Atlantic Charter, one is able to 
analyze not only his attitude but also his level of information. 

The relatively free interchange between interviewer and
respondent which is characteristic of the open question permits the
interviewer to discover whether the respondent clearly understands
the question which is being asked of him. On the other hand, once
the respondent has selected one of the proffered alternatives to a
closed question, the interviewer can assume only that the
respondent understood the question and chose the alternative which best
approximated his own position.

Another difference between the open and closed question is
encountered in coding responses. In both types the respondent's
attitude or perception must be categorized. If the question is closed,
the categorization is made by the respondent or the interviewer.
If the question is open, the categorization may be made by the
interviewer; it is usually considered preferable, however, to have
the responses coded at some central place by people trained for
this work. Each of these coding procedures has advantages and
disadvantages, which are discussed and evaluated in Chapter 10.
For a more detailed discussion of open and closed questions and
their respective uses, the reader should see Lazarsfeld (13).
Lazarsfeld contends that the methods can be combined effectively if the
open question is used in an elaborate pretest followed by closed
questions in the main study with open questions used in a follow-up
of critical cases.

The Pretest

No matter how astute the researcher has been in wording his 
questions and designing his questionnaire, he needs to try them out 
with respondents before launching into the actual field studies. The 
pretest is, in a sense, a miniature study in itself. The first function 
of the pretest is aimed at testing the questionnaire from the research 
point of view. The interviews should be analyzed to see whether 
the responses fulfill the research objectives. Some of the researcher’s 
"best questions" often fail to elicit the type of response which meets 
the objectives. An analysis of these trial interviews in relation to 
the objectives will increase the probability of fulfilling the research 
objectives. Often the pretest calls for major revision of the questions, 
and several pretests are required until a workable questionnaire is 
achieved. 

A second objective of the pretest is to determine the extent to
which the questionnaire meets the criterion of respondent orienta-
tion in all its aspects. Does the questionnaire promote the appro-
priate relationship with respondents? Do respondents understand
the questions? Can the questions be asked without having to be
explained or reworded? There are no exact tests for these charac-
teristics. The help of experienced interviewers is most useful at this
point in obtaining subjective evaluations of the questionnaire.





PRINCIPLES OF INTERVIEWING 


The preceding section dealt with the instruments of data col- 
lection. This section will discuss the specific techniques which the 
research interviewer uses. The techniques proposed are a systematic, 
well-tested set of procedures which are consistent with the principles 
of communication discussed earlier in this chapter. 


The Introduction to the Interview

The first step is often the most difficult for the interviewer, 
because at the initial contact the respondent must be motivated to 
permit the interview. Ordinarily the interviewer will follow a 
sequence of procedures approximately as follows: 

1. Explain the purpose and objectives of the research. 

2. Describe the method by which the respondent was selected. 

3. Identify the sponsor or the agency conducting the research. 

4. State the anonymous or confidential nature of the interview. 

In the early phases of the interview, the interviewer plays one 
of his most important and one of his most autonomous roles. It is 
difficult to describe the precise acts which an interviewer should 
perform in order to provide adequate motivational basis for the 
respondent to communicate the information which he seeks. The
establishing of rapport is clearly not a scientific procedure in the
sense of being capable of objective statement. It is rather a skill
which depends primarily on the know-how, experience, and sensi-
tivity of the interviewer. It is this function of the interviewer which
makes great demands on the qualities of clinical insight and
intuition. 

We have already mentioned that the forces leading a respondent
to communicate can be thought of in terms of a means-end or
path-goal sequence in which the respondent gives information
because he sees the information-giving process as a means of attaining
some goal which he considers desirable. Secondly, the respondent
is motivated to give accurate and complete information as a means
of attaining some satisfaction out of the relationship with the
interviewer. Thirdly, the respondent communicates in the interview
situation only in the absence of certain specific kinds of barriers to
communication.

When the interviewer first faces a respondent, he finds that the
relationship has some structure even before a word is spoken. On
the one hand, the respondent probably will be polite enough to let
him talk; on the other hand, there may already exist certain barriers
which must be overcome. For example, the public-opinion
interviewer is frequently mistaken for a salesman by the respondent.
Another barrier to communication arises from the respondent's
perception that granting the interview will in some way make him
vulnerable. This is essentially a problem in reassuring the respond-
ent with respect to the anonymous or confidential nature of the
interview. A third barrier comes from a rather frequent respondent
perception that the interview may be intended in some subtle way
to check up on him or his activities. Handling this sort of problem
calls for a convincing explanation by the interviewer of the purpose
of the study and particularly of the method by which the respondent
was selected.

The positive motivation in terms of the respondent’s goals 
comes from a careful statement of the purpose of the research. The 
interviewer tries to sense the respondent's wishes or goals with 
respect to the interview process and, having appraised these, to 
explain to the respondent how the interview relates to them. For 
example, an interviewer working on a study of public attitudes 
toward current matters of foreign policy might come upon a
respondent who, on hearing the purpose of the survey, says to the
interviewer, "You don't want to talk to me about foreign policy.
What I think about those in the State Department would
curl their hair. You'd better find somebody who is a more agreeable
type." The interviewer would assure the respondent that the pur-
pose of this study is not to find out simply the opinions of people
who are endorsing current foreign policies. He would emphasize
that the interview provides an opportunity for the respondent to
register his criticism in a place where it might have some effect on
public officials who sincerely want to learn the general public
attitude, whether it is critical or appreciative.

In some researches the respondent goal is rather clearly per- 
ceived, such as in the case of the worker who is asked to participate 
in a study which may result in better working conditions or higher
pay. In other studies the respondent goal is more obscure. In a
laboratory research problem, for example, the respondent may
gain only the prestige of participating in a scientific or official
endeavor.

Another motivation the interviewer should tap comes from the
personal relationship which he builds with the respondent. In part,
this relationship depends upon the interviewer's being perceived
as a desired agent of communication or change. However, as in
the therapeutic interview, the qualities of acceptance,
understanding, and receptivity seem to have inherent value for the respondent.
Some evidence for the importance of the interviewer-respondent
relationship was obtained by the Survey Research Center when
respondents who had been interviewed about their incomes, savings,
and buying plans were polled by mail about their reactions to the
interview. Their replies were more often couched in terms of the
personal relationship and personal qualities of the interviewer than in
terms of the content of the study or the apparent purpose of the
inquiry. Typical comments mentioned the fact that the interviewer
was a very understanding person or that the interviewer had a keen
insight into the respondent's situation.

Often the interviewer's contribution to respondent motivation
is referred to as "rapport." The term has come into common use
and indicates an increasing sensitivity on the part of researchers
to the importance of the interviewer-respondent relationship. At
times the use of the term suggests a superficial approach to
respondent motivation. Thus, rapport is referred to as if it were some
tangible quantity or some specific task which was to be gotten out
of the way early in the interview as a preamble to getting on with
the main business of data collection. There is the implication that
after the interviewer has said "Good morning" and inquired after
the health of the respondent's family with the properly solicitous
inflection, he can ignore the relationship with the person giving
him data. Contrary to the implications of this approach, rapport
is not something which is "plugged in" at the beginning of the
interview in order to get it off to a good start. Rapport refers to
the atmosphere or climate of the entire relationship between
respondent and interviewer.

Although rapport, or the climate of the interviewer-respondent
relationship, has yet to be reduced to quantifiable factors, we can
distinguish among interviewing situations according to the
"amount" of rapport which they require. Thus, it would be possi-
ble for an interviewer to do a very acceptable job of asking the
two or three simple demographic questions associated with the 
typical school census without having established any relationship 
with the respondent beyond that implied in a civil “Good morning” 
and a display of credentials. On the other hand, if the interviewer’s 
task is to obtain information about some aspect of the respondent’s 
habits— marital relations, for example— it would be necessary for 
him to establish a deeper kind of personal relationship with the 
respondent. In general, we can say that the more intimate, emo- 
tionally charged, or ego-involved the topic of the interview, the 
more delicate the job of establishing the relationship with the 
respondent becomes, and the deeper that personal relationship 
must be. 

When we refer to a deeper or a closer personal relationship as 
appropriate to certain kinds of interviewing, the things we have 
in mind are those associated with such words as warmth, acceptance, 
understanding, tolerance, and the like. We are not suggesting that 
the interviewer can do a more effective job if he is closely involved 
with the respondent’s activities. Thus, for example, we are not 
suggesting that a close friend or near neighbor is the ideal inter- 
viewer; on the contrary, the ideal interviewer-respondent relation- 
ship seems to be one in which the interviewer achieves a consider- 
able degree of closeness in terms of understanding and acceptance 
but at the same time retains the detachment or objectivity which 
we associate with a professional-client relationship. 


Asking the Questions 

The interviewer's job of asking questions from the question- 
naire is comparable to the scientific technician’s role of applying a 
measuring instrument in a standard manner. It is through the use 
of carefully worded questions transmitted to the respondent ver-
batim that we achieve much of the standardization in the interview.

The major aim in putting questions to a variety of respondents
is to have those questions so worded that their psychological value
is equivalent for each respondent. There are infinite differences
among respondents, and it is not possible to vary a question so that
it has the same psychological impact for each. Since, therefore, we
cannot tailor the question for each respondent, the best approxi- 
mation to a standard stimulus is to word the question at a level 
which is understandable to all respondents and then to ask the 
question of each respondent in identical fashion. This, then, is 
the function of the interviewer in using the questionnaire as the 
stimulus. The only instance in which the interviewer is permitted 
to vary this procedure is when an individual is unable to under- 
stand the question as worded. Even in such cases, the interviewer 
is encouraged to repeat the question verbatim before explaining it. 
In many cases, apparent lack of understanding is a matter of atten- 
tion fluctuation rather than inability to grasp the question meaning. 
In such instances, a simple repetition of the question will suffice. 

Except for these minor variations, the interviewer’s role with 
respect to the questionnaire is to treat it as a scientific instrument 
designed to administer a constant stimulus to a population of re- 
spondents. This technique is necessary when quantifiable data are 
desired. In some research of an exploratory nature, or where sub- 
jective analysis is contemplated, the interviewer may be permitted 
much more leeway in the use of the questionnaire. In some research 
he may tailor his questions to each respondent, with the researcher 
indicating only the areas to be investigated. Where quantifiable 
data are needed, however, the more rigid use of the questionnaire 
appears necessary. 

Stimulating Complete Responses

In many cases the use of the question evokes a response which 
is incomplete or which is unclear. The interviewer must have some 
technique which will enable him to stimulate the respondent to 
further verbalization. Moreover, he must achieve this without sacri- 
ficing standardization. For example, if a question is asked of all
respondents, we have, so far, comparability. If at this point each
interviewer asks a different subquestion which he makes up spontane-
ously, the responses are no longer responses to the original question
but will vary from interviewer to interviewer depending upon the
subquestion which was asked. This defeats the objective of
standardization.

Specifically, the interviewer needs techniques to handle the
following types of situations: (1) to elicit additional information
from the respondent when further information is necessary to the
research objective, and (2) to clarify or make more specific
information which the respondent has already given. All of this must be
accomplished without changing or biasing the data.

The "probing" techniques useful for such purposes can be
classified generally as "nondirective." They enable the interviewer
to act as a catalyst, that is, to bring about a reaction without
himself becoming part of the reaction. The effect of such probing is to
increase the intensity or "response-getting" power of the stimulus
question without changing its content or structure.

Thus, to gain more information the interviewer uses such
phrases as "Would you tell me some more about that? I'm
interested in what you're saying. Could you give me a little more about
that?" or "I see what you mean. Can you tell me a little bit more
about how you feel there?" These statements indicate that the
interviewer is interested, understands what the respondent is saying,
and is making a direct bid for more information. To accomplish
the second task, that of clarification of information already given,
the interviewer might use such probes as "Now let me see if I have
it straight. As I get it, you feel . . ." and then summarize what the
respondent has said. Or he might say, "I would like to read my notes
back to you to see whether I have your point of view straight."

It is through the use of such nondirective probes that the
interviewer does much to develop the permissiveness and warmth
which are so important in the interview. The reader who is familiar
with the literature on client-centered counseling will recall the
stress which is placed on the atmosphere of permissiveness as the
basis for permitting the client to examine his own attitudes. Such
an atmosphere permits the client to verbalize the deeper attitudes
which are usually concealed from outsiders. Many of the same
dynamics are present at a more superficial level in the research
interview, whether one is dealing with personal attitudes or factual
data. Let us consider some examples of the effect of skillful probing.

I: How do you feel about money and help to other
countries?

R: Well, I don't know. Sometimes I think we go too far.

I: I see. Can you tell me a little more about what you have in
mind?

R: Well, maybe we ought to give some help but my gosh when I
see our tax money go to help some of those countries who
aren't doing much for themselves, I think sometimes we'd
better lay off.

I: Sometimes you feel we ought not to help them.

R: That's right! I think we'd better let them go their own way,
and to hell with them!

In this example, the respondent made a mildly critical state- 
ment at first. The interviewer reacted to this by being nonevaluative 
and yet accepting. He didn’t criticize the respondent nor did he 
agree with him; he merely indicated a general acceptance of the 
statement. The result of this was a somewhat more pointedly critical 
statement. The nonevaluative acceptance by the interviewer per- 
mitted the respondent to make his final, bitter response without 
feeling the need to defend or modify it. 

The next example is taken from an interview with a farmer. 
The interview was concerned almost exclusively with problems of 
farm production. 

I: How many bushels of wheat did you harvest last year?

R: My gosh, we had a terrible year! When we'd ought to be
planting last spring, it rained all the time and then it got dry
and everything burned up. We didn't get more than 300
bushels!

I: I see. Well, you said you didn't get more than 300 bushels.
Can you give me a little closer estimate?

R: Well, like I said, it was an awful year around here, but I
guess we got a little more than 300, between 350 and 400 I
guess, actually.

I: 350 to 400 you say. Which would be closest?

R: Oh, I think we estimated it at right around 400 bushels.

Notice that the interviewer again began with a nonevaluative 
statement, essentially repeating the respondent’s first estimate. The 
effect was that the respondent revised his estimate of wheat yield. 
It seems apparent that his first response was more concerned with 
the misfortunes of the crop than with a precise estimate of it. The 
interviewer ignored the attitude and focused on the factual part
of the response. The result was that the estimate of 300 bushels,
having served to make the point, was dropped. The final answer of
400 bushels was almost certainly closer to the actual fact.

As indicated earlier in this chapter, the general effect of this 
type of interpersonal relationship is pleasurable for the respondent
in that he has the opportunity to talk with a skillful interviewer.
He reacts to the permissive, accepting atmosphere by communicating 
willingly with the interviewer. 

Recording the Responses 

One final job remains for the interviewer; this is to get a faith- 
ful and accurate report of the responses. Experience has shown 
that the only accurate way to reproduce the responses is to record 
them during the time of the interview, either by mechanical meth- 
ods or by having the interviewer take extensive notes. A good deal 
of relevant information is almost certain to be lost if the recording 
is left until the interview has been completed. It is not within the 
scope of this chapter to discuss various kinds of recording devices. 
Whatever the method, however, the interviewer must be trained 
in its use and it must be carried out faithfully during the process 
of the interview itself. 


SAMPLE INTERVIEW 

In order to demonstrate some of the techniques which have 
been discussed in this chapter, we have included a brief sample 
of a data-collecting interview taken in an industrial plant manu- 
facturing tractors. The respondent was a foreman. The example is 
an excerpt from a phonographically recorded interview. It has been
edited slightly in places to make it more intelligible. The inter-
viewer's questions preceded by an asterisk are questionnaire items.
All others are the interviewer's probes. This example is selected not
as an ideal interview but merely to demonstrate the techniques used
by one experienced interviewer.

1. I: *What do you do on your job?

1. The objective is to obtain a general pic-
ture of the type of work and responsibilities.
The question was asked word for word as it
appeared on the questionnaire.

R: Well, I'm a track foreman, that is, I'm in
charge of the men who are putting together
these metal tracks, you know, that tractors
run on.

2. I: The tractors run on, you say?

2. Unfortunately a transcription does not
show inflections and emphasis. In this case the
interviewer's question had a slight rising inflec-
tion on the end, indicating a mild "I don't quite
get you" probe. This probe does not really lead
in the direction of the objective of the question,
but it does give the interviewer a better back-
ground for further answers by giving him
information about the respondent's work.

R: Yes, these big heavy tractors run on a
steel track like a tank and, uh, after the
tractor's been assembled, we've gotta hang one
of those heavy steel tracks on each side of
them.

3. I: I see.

3. These brief, permissive, encouraging com-
ments appear frequently throughout the inter-
view. This type of comment and nodding of the
head, indicating and encouraging comments, are
the most frequent "technique" which the inter-
viewer uses. In this recording many are lost in
the reproduction.

R: I'm in charge of the crew that does that.

4. I: Well, will you tell me a little bit more
about your job? You say you're in charge of
the crew, just what kind of things do you do?

4. This serves to bring the respondent back
into the area of the question objective. The reader
will note the brief summary of the pertinent in-
formation already obtained and then a virtual
verbatim repeat of the original question.

R: Well, the track comes down the assembly
line and the tractor comes down the
assembly line and the last thing that we do to it is
to put these steel tracks on it so that it can
be driven away and, uh, we've got an electric
hoist that lifts the heavy tracks and puts 'em
in, we jockey 'em into place and then, uh,
some of the men work on the top of the track
fastening cleats to it, and some of the men
work on the bottom and, uh, I kind of look
to see that they're doin' it all right and help
'em out if there's any trouble, keep
track of our production.

5. I: So, one of your jobs as foreman is to see
that the men are doing their work properly
as you indicate and, uh, keep track of
production.

5. This illustrates the use of a content sum-
mary as a probing technique. The interviewer
merely summed up the statements which were
made. This device is particularly effective after a
rambling, incoherent statement. The summary
serves to focus attention on the central content
of what has been said. In addition, it indicates to
the respondent that he has been communicating
ideas and that the interviewer has accepted the
ideas. Usually, the summary stimulates further
responses, either additional data or clarification
of what has been reported previously.

R: Yes, that's right. (Pause) And then every
day they send me a report of what our work
was like the day before, how much work we
got out and how much scrap there was, and,
uh, it's up to me to see that the amount of
work is okay and that there is not too much
scrap.

6. I: I see, uh, are there any other kinds of
things you do on your job?

6. This is a very direct bid for additional
information. The stress on the word "kinds" is
directive in that it asks the respondent to shift
his frame of reference.

R: Oh, I, uh, I take care of things like time 
off and, uh, figuring out, uh, uh, when the 
men can go on vacation without busting up 
the work schedule, and, uh, if, uh, a man 
has been around for a while and is about
ready for a pay raise, uh, higher pay rate on 
the job he’s doing, it's up to me to recom- 
mend ’em, sometimes if they want to pro- 
mote a man to a higher job, uh, that’s on 
my recommendation. 

7. I: Uh hum, uh, *How long have you been
on this job?

R: Oh, a coupla' years.

8. I: You've been on this job a couple of years?

8. This restatement of the response brings
additional specificity from the respondent. 

R: Well, not quite so long— uh— let’s see, I 
came on this job after Joe left, and that was 
a year ago Christmas time— it’s about a year 
and a half, really. 

9. I: Year and a half, I see. *What were you
doing before that?

R: I was laying track.

10. I: That was for the same company, you
mean?

10. Part of the objective of the question in 9 
is to determine whether the person’s prior job 
was in the same company. Here the interviewer 
uses a direct question to ascertain this informa- 
tion. 

R: Yup, right here.

11. I: I see. Well, that gives me an idea of what
your job is and how long you've been on the
job. *Now tell me, how do you feel about
the job you have now?

11. This illustrates a brief transition state-
ment. The interviewer by his remarks indicates
that one area of the questionnaire is completed 
and another is to be introduced. This is a useful 
technique to help the respondent shift his frame 
of reference to the new topic. 

R: Well, it’s better than when I was on the 
track line. 

12. I: How's that?

12. Another probe which is used when clari-
fication is wanted. The inflection indicates "I'm
not quite sure I understand you. Will you amplify
that remark?"

R: Well, for one thing, foreman's pay is
higher and also it's more regular; it goes
right on, it's not hourly.

13. I: Uh hum.

R: And, uh, besides that, I like the kind of
work better.

14. I: Uh hum, okay, you say you like the job
better than the one you had before that, uh,
let's take it all-in-all, how do you feel about
your job?

14. Up to this point the respondent has 
been answering in terms of details on his job.
The objective calls for general affect. The inter- 
viewer tries to communicate this frame of refer- 
ence in this way of re-asking the question stressing 
the over-all aspects. 

R: Well, I guess I like it all right— it's got 
its headaches like all good jobs do, I guess. 

15. I: Well, that's one thing we want to talk
about, uh, you've given me some information
on things you like about the job, but I was
going to ask you, uh, *What are some of the
things you like best about your job you have
now?

15. One of the problems of an interviewer 
is how to ask a question when the respondent has 
given a partial answer under other questions.
This illustrates a technique for handling the 
problem. The interviewer recognized that the 
respondent had mentioned something on the 
topic earlier; then he proceeded to ask the ques- 
tion. This avoids the implication that the inter- 
viewer was not attending to the earlier discussion 
and serves to get new material. 

R: Well, I think, uh, the thing I like best,
like I was saying before, is the higher pay, 
and, uh, uh, the security of getting a job in 
management. 

16. I: Uh hum, you mentioned higher pay and
the security of the job. Are there any other
kinds of things you think of there?

16. This type of probe was discussed earlier:
the summary of conversation, then the request
for other responses.

R: Well, I guess you could say I like the
supervisor's kind of work.

17. I: Uh hum.

R: It gives you a chance to work with the
men and at the same time . . . (interruption).

18. I: You say you like the supervisory kind of 
work. Could you tell me a little bit more 
about what you have in mind there? 

18. The interviewer felt that the respondent
had not given enough information on the super-
visory work. Hence the probe, which directed the
respondent to this topic. However, the inter-
viewer did not allow the respondent to exhaust
his attitudes on the foreman's role. He broke into
the discussion and returned to a topic which the
respondent had mentioned earlier. This was
unfortunate, since he lost material which might
have been very relevant on working with the
men.

R: Well, what I mean is working with
people. Now, uh, I remember some of the
things that used to seem good or bad to me
when I was on the track line, and this way
being foreman, I get a chance to try to make
the setup a little better for the rest of the
guys.

19. I: You have a chance to help the men some.

19. This is a statement rather than a ques-
tion. It serves to summarize responses and
encourage further responses.

R: Yeah, I remember how it was when I was
on the line, and I think I can make things
better.

20. I: Such as?

20. This probe is the same as "What were
you thinking of here?"

R: Well, like making it handy to get tools
they need and arranging the work so they
don't have to work hard sometimes and loaf
others, things like that.

21. I: I see. Well now, let's take the other side
of the picture for a minute, uh, *What are
some of the things that you don't like about
your job?

R: Oh, I don't really know what to say to
that.

22. I: Uh hum.

R: I, uh, I don't like to complain, you
know, they've been pretty good to me here.

23. I: Sure, I understand that, uh, what we were
thinking of, uh, that uh, on most jobs a per-
son has, there are some things that he may
not like as well as others, some things that
he may actually dislike. We are trying to get
a general picture of, uh, what some of the
things are that aren't so good.

23. This probe follows the resistance which
the respondent showed in his previous remark.
It is a general restructuring and is supportive in
that it recognized the respondent's reluctance to
be critical and attempts to make it acceptable for
him to give negative statements.

R: Well, one of the things you might say
that's bad about this job is the condition we
get the parts in.


24. I: What, uh, what's that?

R: Well, what I mean is, these steel cleats
that we have to put on the tracks, we bolt
them on, and there's another section further
up the line where they're supposed to drill
holes in the right places for us to slip the
bolts in, and half the time they do such a
sloppy job there that when we try to put the
bolts in place, we find they don't fit and we
have to spend time reaming out the hole,
and when we do that, we slow down on pro-
duction and then the general foreman comes
around and chews me out.

25. I: Uh hum.

R: And, uh, it seems to me that if they got
things tied together better, that wouldn't
happen. We don't have a chance to talk that
over.

26. I: I see, uh, you've, uh, you've mentioned
one thing, that the, uh, way you get the parts
is, uh, handicapping you in your work. Uh,
what else do you think of in this respect?

26. This was a poorly timed probe. The
respondent was talking about his supervisory
problems at a level which would give real insight
into his foreman's role and its problems. The
interviewer chose to focus the respondent on a
new area rather than to follow up on the basic
problems.

R: Oh, I don't think there's anything else
I really ought to mention.

27. I: I'm interested in what you have in mind
there.

27. Here again the respondent shows some
resistance. In this case the interviewer merely
asks him to talk about these resistances. In re-
sponse, the respondent mentions a problem area.
This was probably a more effective technique
than making a more supportive remark.

R: Well, they have awful tight production
schedules here.

28. I: Uh hum, and that affects you.

28. This follows up the previous comment.
Here the interviewer recognizes the attitudes im-
plied by the response. It is a statement rather
than a question.

R: The, uh, general foreman holds our sec-
tion responsible for getting out a certain
amount of track each day. It seems as if he
doesn't know anything but just 100 percent
all the time.

29. I: This causes you some problems, I gather.

29. This again recognizes the attitude un-
derlying the remark. Notice that the response is
in terms of the attitude rather than content. The
real attitude comes out in the response after
Question 30. Those readers familiar with the
principles of client-centered therapy will recog-
nize the technique. This is the first place in the
interview where there is real emotional content
to the responses. The recognition of that emo-
tional content helps to bring the interview to a
level of discussing those attitudes rather than
conversation at the symptom level.

R: You're darned right it does, especially,
like I was saying, when the parts we get
aren't quite right.

30. I: Yes, I see.

R: It seems to me that people higher up in
management ought to find out more about
how things are for us!


31. I: You feel it would be an easier job for you
if people higher up knew more about your
job?

31. Again this brief summary helps the gen-
eral permissive atmosphere.

R: Yeah, they don't come around to find out
how things are really going -- it's just 100 per-
cent production, or else!

32. I: Well, let's turn to something
else. *Do you have a shop
steward in your section?

32. By asking this question at this time, the
interviewer closes off a fruitful area of attitudes.
It is interesting to note that at this point the
respondent shows negative reactions. He fails to
understand the interviewer's question and quib-
bles over words. This may well represent his
resentment at being closed off.

R: You mean, uh, union?

33. I: Yes, I mean a union shop steward.

33. This clarifies the question. It does not
change the question but merely defines what is
meant.

R: Yeah, we have 'em.

34. I: Well, how, uh, *How do you get along
with the shop steward?

R: Oh, pretty good I guess. I don't quite
know what you mean "get along."

35. I: Oh, I suppose what we have in mind
there is that we are interested in finding out
how people perform these jobs, and how
people in union shops who do these jobs get
along, when they have to work together like
this. What are your ideas on that?

35. It appears that the interviewer was
caught off balance by the respondent's question.
He was not sure whether the respondent was re-
sisting or merely unclear as to what was wanted.
He responded in terms of the question objectives.

R: Well, most of the time, we don't have
any trouble with each other. I try to take
care of my job and he takes care of his. Of
course, sometimes there are differences that
have to be settled.

36. I: How do you handle differences when they
come out?

36. This is a direct question which is in line
with the objective of the questionnaire item.

R: Well, like suppose a man figures his job's
timed too tight. He can mention it to me
directly if he wants, or he can take it to the
shop steward. Now, if the steward gets it, he
can come around and we'll just talk it over,
or if he wants to be nasty about it, he can
file it as a formal grievance.

37. I: I see. How do you usually work out these
cases in your section?

R: Oh, he usually comes around and tells me
what he thinks, and if we can work it out
together, uh, we don't make a grievance out
of it -- grievances are just tough for the union
and tough for us.

38. I: Uh hum, then in most cases you are able
to work this out between yourselves.

R: Yes, he's reasonable. He's a little stiff-
necked sometimes on the time-study things.
Heck, I don't do the time study either. I'm
in the same box he is.

39. I: Yes, I see. Well, this covers about all the
questions I wanted to ask you there. There
are, there is just a little information I would
like to get from each person we interview,
uh, *About how old are you?

39. At this point the interviewer is ready for
the personal-data information. He restructures be-
fore asking the personal data.

R: I thought that, uh, these were gonna be 
anonymous and nobody was gonna care who 
gave the information. 

40. I: Yes, that's right, let me tell you a little
bit about what we get here. As I mentioned
earlier, before we started the interview, uh,
we don't take people's names; we aren't in-
terested in identifying them at all. We do,
however, want to know something about the
people we talk to because, you see, the older
people who have been in the company longer
may feel differently from the people who
have been here a short time, and the younger
people may feel differently from those who
are somewhat older, things of that sort. We
are not interested in identifying you in any
way. And so, I have just a few questions of
this sort. Would you just give me a little
general information about you, for example,
your age -- how old are you?

40. Having met resistance, the interviewer
goes into more detail in describing reasons for
collecting personal data. It is common in the
research interview to get some resistance on these
questions because of the concern over personal
identification. Interviewers usually use the tech-
nique illustrated here. They make a short state-
ment and ask the first question. If this brings
resistance, they give a more complete statement
of purpose.

R: Well, I'm 33.

41. I: *How many grades of school did you
finish?

R: Well, I never got a chance for much
school.

42. I: Uh hum, about how far did you get?

R: Eighth grade.

43. I: Eighth grade.

R: Had to go to work.

44. I: I see. Now that last one -- *About what
would your total income be this year for
yourself and your immediate family?

R: I don't see what that's got to do with it.

45. I: Well, this is another one of these, uh,
items I was mentioning to you; people with
different pay rates and salaries may very well
feel differently on some or a lot of these
questions we ask -- for example, you remem-
ber our discussion about how you felt about
your pay a while back. Well, it may be very
well that a person getting one amount of pay
would feel very differently from a person
getting a different amount. This gives us a
chance to make a statistical kind of analysis.

45. Again a statement of purpose was nec-
essary. It is at this point that the new interviewer
is likely to get on the defensive and have the
respondent refuse to give an answer. A calm state-
ment of purpose usually overcomes this. This
respondent showed more resistance on these items
than is usual. The reason may be that he gave
considerable critical information during the in-
terview. (Much of this does not show in this
excerpt.) He would be concerned about the pos-
sibility of having his responses reported to the
company and identified with him.

R: Well, I get seventy-three dollars a week.

46. I: Seventy-three dollars a week. I see.
SOME PRINCIPLES OF INTERVIEWER TRAINING 

So far, this chapter has stressed the point that the collection of
data by means of personal interviews is a highly complicated, tech-
nical job which demands much of the interviewer. It is clear that
if the techniques described here are to be effective, interviewers
need careful training. Much of the validity of the obtained data
depends upon the skill with which the techniques are applied, which
in turn depends upon how well the interviewer is trained.

This section presents some general training principles which
have been found to be effective. The training program has three
major emphases. The first is to clarify to the trainee the goal of inter-
viewing. In many training programs for other kinds of skills, the goal
of training is clear and the training need not, therefore, be greatly
concerned with this aspect. If, for example, we are training a person
to operate a typewriter efficiently, it is clear to the trainee that his
task involves rapid and accurate manipulation of the typewriter.
For the lathe operator, the same goal is evident. The goal for the
interviewer is not so apparent. Most people have had some experi-
ence in interviewing, whether in the formal sense or not. In our
everyday lives we often ask people questions to get information of
one sort or another. To the new interviewer, then, the goal of train-
ing seems apparent and simple: that is, how to ask questions of
people and get information from them. What is lacking is an under-
standing of the characteristics of a good interview, that is, the re-
quirements that need to be met before an interview can be con-
sidered good. What are the principles of standardization, of validity,
etc., which we are trying to achieve?

Describing the goal for the interviewer can be done partly by
helping him to understand the total research process and the part
which he has in that process. The interviewer needs to know how
a study is designed, the general principles of sampling, and how
the data are to be analyzed. This information will serve as a basis
for his understanding of the interview in relation to the total
research process. These research principles establish the basis for
the interviewer's job. If this orientation is successful, the new inter-
viewer now sees what he is trying to accomplish through his train-
ing. Furthermore, this knowledge provides him with a basic under-
standing so that he sees why he is trained in certain techniques.

The second training aim is to motivate the interviewer. We have
implied in the discussion of "goals" that the interviewer has some
real reason for wanting to reach this goal. It is unwise, however,
to assume that because the goal has been pointed out, he is highly
motivated to achieve it. The interviewer must feel that what he is
doing is important and significant; he must have an enthusiasm
for his work. Although this is the usual part of a training function,
it is well to point out that it is important to stress motivational
aspects; for example, pointing out to the new interviewer why the
study which he is about to undertake is important, what its function
is, how it will be used, why it is necessary that the data be collected
accurately, and things of that sort. Another motivational factor
which is common in interviewers is craftsmanship, that is, satisfac-
tion with an interview well done, particularly if the situation has
been a difficult one. Earlier we talked about motivating the respond-
ent to communicate. It is clear that it is difficult for the interviewer
to motivate the respondent if he himself is not motivated.

The third aspect is training in interviewing skills, or imparting
to the interviewer the specific methods and techniques which will
make him an adequate interviewer. It is the feeling of the writers
that in many interviewer-training programs too much training is
given in terms of "rules" and specifics, that is, in terms of "The
first thing you should do is this," "The second thing you should do
is that," "When you get a question of this sort, it should be handled
this way," etc. The interviewer is presented with a long list of
specific techniques which he is to use, but the specifics fail to add
up to a general, integrated system or a conceptualization of inter-
viewing. This conceptualization can usually be developed in dis-
cussions of the research process and the role that interviewing plays
in the research process, by showing how the other phases of the
research process are dependent on the interview, and by demon-
strating how failure to follow these principles leads to error or
invalid results. Once this has been established, training on the
specifics is in order. Methods of skill training have been summarized
by Bavelas (3) in the following statement: "What appears to be the
most effective method of training skills is a common-sense one --
watch others, let us watch you, discuss and evaluate differences, and
try again." This implies the use of informal group discussion tech-
niques and practice rather than lectures or dependence upon written
material.

One way of giving experience is to have the interviewer conduct 
actual interviews. This, however, has the disadvantage that the 
trainer has only a second-hand report of what took place during 
the interview, since he was not present at that time. The ideal 
method is one in which an actual interview can be conducted in the 
presence of the trainees for all to observe and for all to discuss. One 
technique which fulfills this objective and has other advantages is
that of "role playing" or "reality practice." It has been adapted
as a technique for training in behavioral skills, primarily
skills involving interpersonal relationships.

In using role playing, one member of the group plays the part
of the respondent, identifying himself with some person he knows
and responding to the interviewer in terms of the role which he is
playing. Another person plays the role of the interviewer. The rest
of the group act as observers. When the role-playing session ends,
there is a discussion of the techniques that the interviewer
used. Many times the trainee gets as much out of playing the role
of the respondent as he does out of playing the role of the inter-
viewer. By playing the role of the respondent, the "respondent" can
perceive where the interviewer failed to get information and where
the interviewer used techniques which were irritating or embarrass-
ing. By analyzing his own reactions to being interviewed as he
experiences the effects of the interviewer's techniques, he can become
sensitized to the reactions of respondents. The trainees who are
observing have a chance to see the performance and eliminate errors
in their own interview techniques. Barron (4) makes this statement
about the use of role playing as a device for training interviewers:

The use of role-playing or reality practice is being increas-
ingly recognized as an effective means of translating the prin-
ciples into methods, of learning the "how," of getting the feel
of doing something in a situation where one is not playing for
keeps. In training which is directed toward improving skill in
interpersonal relations, it is offered as an effective way of bridging
the gap between the formal study of principles, methods and
techniques on the verbal level and actual work with these
methods and techniques. It offers an opportunity for practice in
the kind of work like interviewing, where close supervision and
training on the job are very important.

In addition to the use of role playing as a device for trans- 
mitting skills, the use of phonograph recordings which illustrate 
various aspects of the interview and typical examples of interviews 
are very useful. They serve to point out to the interviewer what an 
actual interview sounds like and how an experienced interviewer
handles a specific situation. They serve, too, as a basis for general 
discussion of interviewing methods. 

However effective the training, it is unrealistic to expect
the original training to make finished interviewers or that it will
be equally effective with all interviewers. One of the essentials of
training is that further training be conducted periodically as the
interviewers proceed in their work. As the interviewers grow more
proficient, they become more interested and more involved in the
fine details of the interviewing process. They want to discuss specific
types of probes, the motivation of difficult respondents, etc. In such
sessions, role playing is a valuable technique. It permits an inter-
viewer who has a problem to act the role of the respondent and
thus portray the difficulties which he is having. The difficulties
can usually be ironed out through the role-playing session, which
also provides the opportunity for the rest of the group to profit from
his experience and to learn along with him.


SUMMARY 

The purpose of the present chapter has been to discuss the 
technique of the research interview, to offer a rationale or theoretical 
framework for the technique described, and to put the interview 
in perspective as one of the various devices for data collection which 
are at the disposal of science. 

We began with the postulate that scientific progress depends 
importantly upon the systematic collection of data and that this 
involves (1) a statement of specific research objectives, (2) definition 
of the data required to meet such objectives, (3) determination of 
the population from which these data can be obtained, and (4) selection
or development of techniques adequate to evoke the data. We have
attempted to demonstrate that the interview can approximate
these criteria in social research, and that the interview is especially
adapted to the collection of data about attitudes and perceptions,
beliefs, feelings, past experiences, and future intentions.

The problem of respondent motivation was discussed in terms 
of two major motivational sources: (1) the respondent’s perception 
that by participating in the interview he may help to achieve some 
goal or bring about some change which he considers desirable, and 
(2) the direct gratification or catharsis which the respondent realizes 
by speaking to a person who is understanding and accepting of his 
opinions and ideas.

Questionnaire design was presented as the task of creating an 
instrument which would serve to translate the research objectives 
without bias into terms understandable to the respondent and
would, at the same time, assist rather than retard the interviewer
in motivating the respondent to communicate. The specific aspects 
of questionnaire construction were also presented, including lan- 
guage, frame of reference, information level, social acceptance, word- 
ing, and question sequence. 

The specific techniques which the interviewer must employ to
evoke complete and frank responses were reviewed, with especial
attention to rapport building, "probing," and recording of re-
sponses. These techniques are also illustrated in the sample inter-
view and accompanying commentary.


CHAPTER NINE


Observation of Group 
Behavior 


Roger W. Heyns and Alvin F. Zander 


Within the past few years there has been a great increase 
in the use of observation methods in the study of social phenomena. 
These experiences have indicated that direct observation of social 
behavior can provide reliable and conceptually meaningful data in 
field studies as well as in laboratory experimentation.

This increase in the use of observers has been accompanied 
by an increase in methodological sophistication in observation 
methods. There is a growing awareness, for example, of certain
typical problems in the development of observation schedules,
the training of observers, and the achievement of reliability. Many 
of these problems have not yet been subjected to methodological 
research, but there is a good deal of "wisdom" in these areas which, 
until the necessary research is done, will help the investigator to 
avoid some of the more common pitfalls. 

This chapter will deal with two principal types of observation 
instruments: category systems and rating scales. We shall discuss the
finished products of both types and the problems involved in their
development. To provide a focus for this discussion, we shall begin
by describing an observer team in an actual situation.


AN OBSERVER TEAM IN ACTION

The Setting

Let us suppose that an observer team of two is studying methods
of problem-solving in groups. Let us suppose, further, that this 
phase of the investigation involves the observation of a large num- 
ber of groups in the field. This means that the observations are 
being done in a relatively uncontrolled situation. The team can 
control neither the kinds of variables which will be present nor the 
behavioral responses to them. The specific duties which have been 
assigned to the members of this team, as well as the way in which 
they are to perform them, have been dictated by the study design 
and the theoretical framework which have been developed. 


The Entrance and Behavior of the Observers

The observers have been trained to make themselves as un-
obtrusive as possible. They make sure that the group
knows they will be present and that the group
is generally aware of the observers' purposes in attending the meet-
ing. They arrive at the meeting place early, introduce themselves
to the leader, and answer any additional questions he may have
concerning their purpose and the way in which they will function.
Their manner toward the group is positive [...]. Both their "style"
and the content of their answers to questions are intended to make
the group members feel that the observers are present to record
objective facts rather than to evaluate individuals.

Before the meeting begins, one member of the observer team
explains their presence, stressing the fact that
they are interested not in individuals but rather
in how groups go about solving their problems. He
suggests that the members disregard the observers and
not involve them in the discussion. He also describes briefly the
nature of the work they will be doing while the meeting is going on.

The observers take especial precaution to keep their activities
from interfering with the meeting. They sit as far away from the
group as space permits. They avoid conversation with each other
and any other communication which might reveal their attitude
toward what is occurring in the meeting unless there is laughter,
applause, or some other reaction which they can appropriately
make without appearing to differ from the dominant climate in the
room at the time.


The Data Obtained by This Team 

Observer A codes the problem-solving process of this group;
observer B records the content of the meeting. To test the hypoth-
eses developed for this study, the problem-solving observer uses a
prepared category system.¹ He is responsible for coding each relevant
contribution of each member into one of a number of categories.
He also records which member made the contribution and to whom
it was addressed. These categories are listed on a standardized form
which facilitates rapid recording of these data. This observer watches
the group interaction in terms of the categories which are listed
below with their definitions. By the time the observer is on the job,
he has memorized these definitions, of course, and hence they are
not repeated on his observation form.
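
The record that observer A keeps (a category, the member who spoke, and the member addressed) can be pictured as a small data structure. The following Python sketch is offered only as an illustration of that idea, not as the project's actual form: the category labels used as keys, the field names, and the sample meeting are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass

# Illustrative labels for the problem-solving categories (assumed spellings).
CATEGORIES = (
    "goal-setting", "problem-proposing", "information-seeking",
    "information-giving", "solution-proposing", "development-seeking",
    "development-giving", "opposing", "supporting",
    "summary-seeking", "summary-giving", "non-problem-directing",
)


@dataclass(frozen=True)
class Contribution:
    """One coded contribution: who spoke, to whom, and in which category."""
    speaker: str
    addressee: str   # a member, or "group" when addressed to everyone
    category: str

    def __post_init__(self):
        # Reject any code that is not on the prepared category list.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")


def tally(contributions):
    """Return counts per category and per (speaker, addressee) pair."""
    by_category = Counter(c.category for c in contributions)
    by_pair = Counter((c.speaker, c.addressee) for c in contributions)
    return by_category, by_pair


# Hypothetical record of five contributions from one meeting.
records = [
    Contribution("A", "group", "goal-setting"),
    Contribution("B", "A", "information-seeking"),
    Contribution("A", "B", "information-giving"),
    Contribution("C", "A", "opposing"),
    Contribution("B", "A", "information-seeking"),
]
by_category, by_pair = tally(records)
```

Keeping one record per contribution, rather than running totals, leaves the investigator free to count by category, by member, or by speaker-addressee pair afterward.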

Problem-solving Categories²

Goal-setting: These contributions have the function of establish-
ing or suggesting goals or objectives, both procedural and con-
tent. They are concerned with ends to be attained. These
objectives, goals, or ends may be those of the individual which
he is trying to have the group attain; they may consist of state-
ments of accepted goals of the group or part of the group.

Problem-proposing: These contributions serve the function of
presenting a problem, either in content or in procedure. They
are concerned with means to ends or goals.

Information-seeking: These contributions have the function of
seeking to obtain information of an objective, factual, or
technical nature. The information sought is from the area of
fact on which the group decision is to be based or which has
bearing on the decision. Contributions seeking factual, objec-
tive, or technical information concerning the procedure of the
group or an individual are classified here.

Information-giving: These contributions have the function of
providing objective, factual, or technical information, either
in the subject area or with respect to procedure. The category
includes the citing of examples or illustrations.

Solution-proposing: These contributions serve the function of
indicating solutions to problems. They are suggested means to
ends. Modifications of or additions to solution proposals pre-
viously offered are classified in this category if the context gives
the contribution a solution-proposing function.

Development-seeking: These contributions serve the function of
attempting to obtain clarification of previous contributions.
They seek to determine what was intended by a previous con-
tribution, what its implications are, what inferences are per-
missible. These frequently take the form of an inference stated
as a question.

Also included here are contributions which facilitate the
procedure of the group by asking the group as a whole, or
individuals, to comment, indications to individuals that they
have the floor, etc.

Development-giving: Contributions here elaborate, make explicit,
enlarge on contributions. Included here are inferences from
previous contributions; self-repetition, or restatements by others,
which [...] reflecting types of contributions; [...] of previous
contributions without intent of [...] but which are, rather,
declarative statements of what the contribution stated or implied.
Finally, this category includes contributions which provide the
rationale, [...] arguments for the individual's positions. They give
his reason for his saying what he does.

Opposing: These contributions are characterized by an opposi-
tion [...] disagreement with a suggestion, solution, [...] etc.
Responses which point out obstacles, difficulties, or objections
are included here.

Supporting: These contributions serve the function of indicating
[...] a suggestion or solution proposal. [...] are indications of
approval of the fact that another has contributed, whether
approval of content is present or not. This is a supporting
comment in procedure.

Summary-seeking: These contributions ask, in effect, for a sum-
mary, e.g., "I'm lost, where are we now?"

Summary-giving: These contributions summarize the group's
progress to date. They refer either to substantive material dis-
cussed over a period of time, to conclusions reached, or to the
group's procedure over a period of time. Summary statements
of individual participations are not included here.

Non-problem-directing: This category includes irrelevancies of
the tangential sort and a myriad of responses of an inter-
personal sort, such as "Give me the ash tray," and "How about
opening a window?" It includes statements which have no
reference to the subject matter of the conference or to the
group procedure.

¹ At some point in the data-collection process an additional observer can
be introduced to provide a reliability check for this coding.

² This set of categories and other examples in this illustration are taken
from the procedures used by the Conference Research Project, University of
Michigan (8).

While observer A notes problem-solving contributions in the
categories above, observer B records the nature of the meeting
content. He keeps a running account of the actual subject matter
discussed in accordance with the following:

1. Notation of each topic discussed.

2. Classification of each topic into one of the following:

   a. Procedural: a topic having to do with the procedures or
      process of the group.

   b. Substantive: a topic having to do with the subject matter
      of the meeting.

3. Notation of the nature of the task confronting the group at
   each point on its agenda, using the following categories:

   a. To arrive at a decision.
   b. To approve a decision already made.
   c. To receive or give information.

4. Observation of what actually happens to each item on the
   agenda by noting whether the agenda item was:

   a. Completed.
   b. Postponed.
   c. Left uncompleted.

For each of the tasks above the observer has been trained to
follow carefully stated definitions in an "Observer's Manual."
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4. Urgency: How urgent did ihe group regard these problems? 

0 1 2 3 4 5 6 7 8 9 10 

No Urgency Very pressing 

(Note: This is a judgment of the extent to which the group 
felt it was necessary to arrive at a decision at this meeting.) 

5. Importance: How important to their organization were these 
problems regarded by the group? 

0123456789 10 

Of liHle The very life of 

consequence the organtzafion 

depends upon if 

6. How formal were the interrclaiionslu'ps among the people in 
the group? 

0 1 23 - 456789 10 

Extremely Predominantly Lorgely Completely 

formal formal informol informol 

(Note: Rate on basis of mode of address, number of personal 
comments, and number of asides used by the individuals 
indicating social distance among them.) 

7. How supportisc and accepting was the group of its members? 

0 t23456rB9 10 

The group was The group was 

highly critical permissive ond 

end punishing highly receptive 

8. Howplcasant was the affccthc interpersonal atmosphere of 
the group? 

0123456789 10 


Very unpleosant; 
quarrelsome, critical 
end unfriendly 


Very pfeoiont; 
personobie, warm, 
and enioyobte 
When the meeting is finished, each observer is required to rate
certain aspects of the meeting, particularly the areas of communication,
motivation, and interpersonal relations. Each works independently,
using the following scales:

1. Understandability: To what extent were the participants
getting the meaning of one another's statements?

0 1 2 3 4 5 6 7 8 9 10

They were talking              Communicated
past one another;              directly with
there was much                 one another
misunderstanding

(Note: This rating should include not only trouble with particular
words but also more general conceptual processes,
such as difficulties in level of concreteness, style of expression,
etc.)


2. Opportunity to Communicate: To what extent did the participants
have opportunity to talk?

0 1 2 3 4 5 6 7 8 9 10

Never had     Seldom had    Usually had   Had every
opportunity   opportunity   opportunity   opportunity
to talk       to talk       to talk       to talk

(Note: In some groups this can be judged by the number of
times persons seemed eager to get the floor but could not and
the number of simultaneous participations. In other groups
the participants have already learned not to try to talk because
of the dominance of a few members; hence, although
things need saying, the participants have little opportunity
to say them.)


3. Ego-involvement: How much did the members have at stake
in the problem outcomes?

Nothing to                     Much to gain
gain or lose                   or lose
behavior may be coded; a category system consists of two or more
categories. A carefully developed category system provides a com- 
mon frame of reference for observers and increases the likelihood 
that the relevant aspects of the total behavior will be noted with 
reliability. 

The exact nature of the category system (i.e., its characteristics
with respect to the number of categories, the level of conceptualization
they involve, the applicability of the set to a wide variety of
situations, etc.) depends upon the purposes of the investigator and
the theoretical framework within which he is working. Although
it is possible to distinguish various types of category systems along 
a large number of dimensions, several dimensions seem to us espe- 
cially useful for understanding the kinds which have been used and 
the types of data which can be obtained by their use. 

THE DIMENSION OF EXHAUSTIVENESS. Some category systems are
developed in such a way that all the behavior observable can be
classified into one of the categories in the set. Bales' interaction
categories make up such a system (2). The 12 categories in his system 
have been developed with the objective that all verbal behavior in a 
small face-to-face group be codable into one or another of them. 

A contrasting system is that used by Jack in her study of ascendance 
and submission in a play situation (9). Her set of categories focused 
on forms of ascendant behavior; behavior not in these categories
was not coded. In a sense, of course, all less-than-exhaustive category
systems are really exhaustive, since behavior not codable into one
of the categories is implicitly in a category labeled “not in the 
system.” This distinction is a real one and consists essentially of 
determining how much of the total observable behavior is to be 
classified into a defined category. 

The question of whether or not a category system ought to
be exhaustive (in this sense) must be decided by the experimenter
in the light of his purposes. Two considerations seem worth pointing
out, however. The first is that much time can be saved in analysis
if the instrument contains only those categories which are necessary.
The second consideration concerns the nature of the behavior not
categorized when a less-than-exhaustive category system is used. Even
though no further discrimination is necessary than "it is not in the
system," it may nevertheless be important to know the total amount
of behavior in this residual, undifferentiated category. This can be
Below some of these scales there are instructions to raters concerning
the cues which are important and the extent to which
certain factors should be weighted in arriving at a final rating. More
detailed instructions of this sort for each item are again provided
in an "Observer's Manual."

Each observer also writes anecdotes describing incidents or
conditions which he thinks might be important. Specifically, he is
asked to note factors which have a bearing on the observations 
obtained (their validity, interpretation), factors which should be 
included in subsequent observations, and the presence of conditions 
which might make the group unsuitable for inclusion in the final 
analysis of data. 

This brief description of the steps a typical observer team might 
follow has been placed in a field setting. The observation process 
in a laboratory is not very different, however. The experimenter 
would probably introduce the observers to the persons being observed
and describe the activities of the observers. More will be said
about the introduction of observers and the problems of observer 
decorum in a later section of this chapter. 

Now that we have described the activities of a hypothetical
observer team, we are ready to turn to a discussion of the technical
features involved in the nature and development of such a system.


CHARACTERISTICS OF AN OBSERVATIONAL 
SYSTEM 


This section will be concerned, first, with the definition and
description of two major types of observation systems: category systems
and rating scales. The descriptions present the characteristics
[...] of both types. Later we shall discuss problems
in the development of these procedures.


Category Systems

One of the most useful devices to describe qualitative social situations
in quantitative form is that of coding the behavior within
separate categories. For the purposes of this section, a category is a
statement describing a given class of phenomena into which observed
which may be described in terms of these concepts. For example,
when we ask observers to make inferences concerning the motives
or emotional state of the actor from behavioral acts, we must either
have empirical evidence that a number of phenotypically different
behaviors have the same genotypical dimension or that our theory
about this situation is such that they may have. The availability
of sophisticated observers is another factor which affects the advisability
of using category systems which involve high-level inference.
If one's resources in this respect are limited, it seems much more
desirable to define clearly the categories of behavior to be classified
with a minimum of inference, leaving the inference task to the
experimenter.

NUMBER OF DIMENSIONS. Some category systems are developed
within a single frame of reference; they require the observers to 
focus on behaviors which are, at one level of conceptualization, 
homogeneous. Others include categories which describe social be- 
havior along a number of dimensions. This distinction can also 
be applied to single categories within a system. An example of a 
category system focusing on a single aspect of group process is that 
developed by Heyns and Berkowitz (8) for the description of the 
problem-solving process in decision-making groups (described at the 
beginning of this chapter). Each participant's contribution is classi- 
fied into one of 12 categories on the basis of the problem-solving 
function performed by that contribution. Other dimensions of group 
interaction, such as the emotional impact of contributions, are
ignored or only minimally represented.

Bales' (2) interaction category system is an illustration of a category
system involving more than one dimension. For example,
Category 1, "Shows solidarity," and Category 2, "Shows tension
release," seem to be descriptions of interaction along affective dimensions;
Categories 5, "Gives opinion," 6, "Gives orientation," and 4,
"Gives suggestion," refer to the intellectual problem-solving activity
of the group.

There is, again, no pat answer to the number of dimensions
which should be attempted in a category system. It should be pointed
out, however, that the number of dimensions dictates the number
of frames of reference which the observers must be aware of and
use. A large number tends to reduce agreement between observers.

A further complication in the use of systems involving more 
done only if one has a record of the amount of total behavior which
was not specifically categorized.
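
The point about the residual category can be made concrete with a small tally. The following is only a sketch under stated assumptions: the category names are hypothetical and are not taken from Jack's or Bales' instruments. It shows that the proportion of behavior falling inside the system can be computed only when uncoded acts are themselves counted.

```python
from collections import Counter

# Hypothetical category labels for a less-than-exhaustive system;
# they are illustrative and not taken from any published instrument.
DEFINED_CATEGORIES = {"directs", "criticizes", "takes_object"}
RESIDUAL = "not_in_system"


def tally(acts):
    """Count acts per defined category, pooling everything else in a residual."""
    counts = Counter({name: 0 for name in DEFINED_CATEGORIES | {RESIDUAL}})
    for act in acts:
        counts[act if act in DEFINED_CATEGORIES else RESIDUAL] += 1
    return counts


def proportion_coded(counts):
    """Share of all observed acts that fell into a defined category."""
    total = sum(counts.values())
    return 0.0 if total == 0 else (total - counts[RESIDUAL]) / total


observed = ["directs", "plays_alone", "criticizes", "directs", "hums", "watches"]
counts = tally(observed)
print(counts[RESIDUAL], proportion_coded(counts))
```

If the residual acts were simply dropped at the time of observation, the denominator in `proportion_coded` would be unknown and the comparison could not be made afterward.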

THE DIMENSION OF INFERENCE. Another way of differentiating
category systems involves the amount of inference that they require
of observers. Some category systems require that the observers coding
the behavior proceed from the actual behavior which they noted
to a deduction about this behavior. At the other extreme are procedures
which require the observers to place the behavior they saw
or heard into categories with no requirement of inference. Let us
suppose that an experimenter is specifically interested in the effects
of strict disciplinary practices in the classroom on the kinds of interaction
of children on the playground. At the simpler extreme, he
might develop categories such as "Shoves other children," "Calls
other children names," "Asks for help." These behaviors are selected
either because they represent the level of data the experimenter
needs to test his hypothesis or because they permit later reclassification
into categories which are at the appropriate level. On the other
hand, he may use such categories as "Shows hostility," "Demands
submission," "Desires support," "Resents dependence." In the latter
case the observers are looking at the same behavior but are making
inferences concerning it. The material which is coded is based upon
these inferences.

When some inference is required to test the hypotheses, the
essential difference between the two category systems is who makes
the inference. In either case, some theoretical system is required
on the basis of which inferences are made, either by the experimenter
when he confronts the data presented to him by observers or by
the observers when they observe the behavior. In other words, in
the first case, the observers note the incidence of behavior in categories
and the investigator makes the inferences during the analysis
of the data; in the second case, the experimenter asks the observer
to make inferences after he has instructed him as to the kinds of
inferences that can be made. This instruction actually consists in
describing the behaviors which can be placed in each category.
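
The first case, in which observers record low-inference categories and the experimenter makes the inference during analysis, amounts to a later reclassification of the tallies. A minimal sketch follows. The pairing of each behavior with an inferred category is an assumption made here for illustration; the text does not say which behavior belongs under which inference.

```python
from collections import Counter

# Assumed mapping, applied by the experimenter during analysis, from
# low-inference behavior categories to higher-level inferred categories.
# The pairing is illustrative only.
INFERENCE_MAP = {
    "Shoves other children": "Shows hostility",
    "Calls other children names": "Shows hostility",
    "Asks for help": "Desires support",
}


def reclassify(behavior_counts):
    """Pool observer tallies into the experimenter's inferred categories."""
    inferred = Counter()
    for behavior, n in behavior_counts.items():
        inferred[INFERENCE_MAP[behavior]] += n
    return inferred


observer_tallies = {
    "Shoves other children": 4,
    "Calls other children names": 2,
    "Asks for help": 3,
}
print(reclassify(observer_tallies))
```

Because the mapping is written down once and applied uniformly, the inference is the same for every observer's record, which is the practical advantage of leaving the inference to the experimenter.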

Once again the decision as to how much inference to require
from observers depends in large part on the purposes of the experimenter.
There are, however, a number of considerations which
deserve special mention. One of these is the degree of confidence
one has in the clarity of the concepts being used and the behavior



Observation of Group Behavior 393 


Rating Scales


Simple rating scales are also often used to record quantified 
observations of a social situation. They may be used to describe the 
behavior of individuals, the activities of an entire group, the changes 
in the situation surrounding them, or many other types of data. 
Rating scales often provide more superficial and less reliable data 
than do well-developed category systems such as those just described.
However, practical limitations may force one to rely upon this 
method to guide observations. 

By the phrase “simple scale" we mean a scale with a set of points 
which describe varying degrees of the dimension being observed. 
Observation rating scales have seldom been submitted to rigorous 
"scaling" treatment in their development, probably because it is 
often difficult to get a sufficiently large number of trials with any 
one observation schedule.

HOW RATING SCALES ARE USED. Rating scales are most often
used in either of two ways: (1) to record behavior at frequent intervals
throughout a sample of social interaction, or (2) to rate the
nature of the entire social event after it has ended. An example of
the former type was used by Lippitt and Zander (10) in a field
experiment on Scoutmaster training. Observers noted the behavior
of Scoutmasters in Scout meetings before and after they had received
training. One five-point rating scale was used to rate the
symptoms of group tension shown by the boys during meetings [...]
by the trainees. The scale was marked whenever the program activity
or the group atmosphere changed during the meeting. The description
of the scale and the observer instructions was as follows:


Physical symptoms of group tension: This [...]
with the state of tension existing in the group as it is [...]
through physical symptoms. It will indirectly measure the [...]
of psychological strain created in the boys during various [...]
of a Scout meeting. The assumption is made that [...]
is physically tense to a noticeable extent there [...]
psychological tension. This psychological [...]
boys themselves. It will be difficult sometimes to know [...]
than one dimension is that the categories on the different dimensions
may not be exclusive. Thus, for example, at one level of analysis
a question may be coded quite correctly as a request for information.
At the very same time, it may be coded equally correctly as
an expression of hostility. This difficulty can be overcome, of course,
by permitting multiple coding. This may not be desirable for other 
reasons, however. Finally, when a single category includes behavior 
on more than one dimension, there should be strong theoretical 
or empirical support for the proposition that at another level of 
conceptualization these two dimensions are similar. Without this, 
a category score would have no single meaning. 
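
Multiple coding of this kind can be recorded by giving each contribution one code per dimension and tallying the dimensions separately. The sketch below assumes two dimensions, a problem-solving function and an affective tone. The field names and code labels are hypothetical and serve only to illustrate the bookkeeping.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedAct:
    """One contribution, coded once on each of two assumed dimensions."""
    speaker: str
    function: str  # problem-solving dimension, e.g. "requests information"
    affect: str    # affective dimension, e.g. "hostile" or "neutral"


def tally_by_dimension(acts):
    """Return separate frequency counts for each dimension."""
    functions = Counter(act.function for act in acts)
    affects = Counter(act.affect for act in acts)
    return functions, affects


acts = [
    CodedAct("A", "requests information", "hostile"),
    CodedAct("B", "gives information", "neutral"),
    CodedAct("A", "requests information", "neutral"),
]
functions, affects = tally_by_dimension(acts)
print(functions["requests information"], affects["hostile"])
```

Keeping the dimensions in separate fields means that each category score retains a single meaning, at the cost of asking the observer to hold two frames of reference at once.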


Discrete vs. Continuous Categories 

Some category systems are so constructed that the categories 
within them can be placed along a continuum. Others, even though 
they have only one dimension, contain discrete categories, which
cannot be located in relation to each other on any very clear continuum.
We can illustrate the first type by referring again to the
researcher interested in describing the playground behavior of 
children. Let us suppose that he has a number of categories into 
which aggressive behavior can be coded: Category 1, "Mild verbal 
attack," Category 2, "Verbal threats and threatening gestures," 
Category 3, "Direct physical attack." According to one classification
scheme, this would be a continuous category system, ranging from
mild aggressive behavior in Category 1 to severe in Category 3. The 
adequacy of a continuous system rests on the adequacy of the theoret- 
ical framework. In the example given, a threat is assumed to be a 
more severe form of aggression than verbal attack. There might be 
situations, however, in which this was not true. When assumptions 
concerning the ordering of categories are justified in a given situa- 
tion, continuous category systems are very desirable. When devel- 
oped adequately, they constitute a scale (see Chap. 11). 
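
When the ordering assumption is accepted, the categories can be treated as ranks and summarized with ordinal statistics. The following sketch assumes the three aggression categories of the example, given short hypothetical labels, and assumes that rank 1 is the mildest. Whether that ordering holds in a given situation is exactly the theoretical question raised above.

```python
from collections import Counter
from statistics import median

# Assumed severity ordering for the three aggression categories;
# the short labels are hypothetical stand-ins for the category titles.
SEVERITY_RANK = {
    "mild_verbal_attack": 1,
    "threat": 2,
    "physical_attack": 3,
}


def severity_summary(coded_acts):
    """Return the count of acts at each rank and the median rank.

    Expects at least one coded act.
    """
    ranks = [SEVERITY_RANK[act] for act in coded_acts]
    return Counter(ranks), median(ranks)


acts = ["mild_verbal_attack", "threat", "mild_verbal_attack",
        "physical_attack", "threat"]
counts, middle = severity_summary(acts)
print(dict(counts), middle)
```

The median is used rather than the mean because the ranks express order only; nothing in the example establishes that the steps between categories are equal.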

Discrete category systems contain categories which have no such 
relationship to one another. An example of this type is the problem- 
solving category system referred to earlier. It is not possible to locate 
such categories as supporting, solution-proposing, goal-setting, and
developing on a single, clear continuum.
signs of restlessness, or a tense, anxious expression.
A football crowd watching the kick for a point after
touchdown would be "keyed up." The keyed-up
behavior may be shown by boys who are in any
posture or state of movement. During a fast basketball
game, for example, the signs by which it is
recognized might depend on the pitch and frequency
of shouts, the facial gestures, etc. Boys who are required
to sit through a tongue lashing may be keyed
up but their tension might be shown in facial
gestures and inhibited movements.

Let us ignore for the moment the adequacy of the assumptions
on which this scale was based and the problem of reliability in
observation on a rough scale such as this; it is clear that this is an
attempt to get a measure of the significance of a large variety of
bodily movements which are interpreted by the observers as indicative
of psychological tension, and that these many movements scattered
throughout the group are gathered in a relatively [...]
way. An example of a scale used to record behavior at the end of
a meeting was described earlier in this chapter. Another scale
was developed by Fouriezos, Hutt, and Guetzkow (6) and was used
to rate the amount of self-oriented need behavior shown by each
member of the group. This type of behavior is described by the
authors as follows:

[Self-oriented need behavior] is not [...]
toward a group goal or the satisfactory solution of the group
problems: it is directed primarily to the [...]
need itself, regardless of the effect on the attainment of the group
goal.

The scale has 11 points, ranging from 0 (no expression of self-oriented
need) to 10 (all behavior of the self-oriented need type), as
follows:

0 No expression of self-oriented need behavior.

1 Some slight indication of self-oriented need [...]

2 [...] need behavior indicated but not [...]

3 Some self-oriented need prominent.
tension is goal-directed anxiety or whether it is an emotional
reaction. An anecdotal note on the source of tension will be
valuable if discernible. This scale asks for both sorts of tension
on the hunch that most of the ratings will concern goal-directed
tension and those which do not will be apparent from the rest
of the data. According to physical signs, how tense are the boys
at this time in the meeting?

0 position: [...]

1 position: Very relaxed. The group is physically and psychologically
    taking it easy. This does not mean that
    they are in a state of rest or repose. It means that
    they are carrying on the kinds of activity which occur
    at a Scout meeting without any apparent air of tension.
    This may be a sprawled conversation group or
    it may be a relaxed game. The boys seem comfortable;
    they act and look the way people do after they rise
    from a good meal.

2 position: Relaxed. Mark this category if the group is relaxed
    but not as greatly relaxed as in the category above.

3 position: Middleground. This category is marked if the boys
    are acting as most people do most of the time. The
    goal has a more positive valence than might be true
    in positions 1 and 2. There is a small amount of
    tension but it is not great enough to be expressed in
    physical signs of tension. They may be physically
    active or sitting still. Facial expression shows no
    apparent signs of strain.

4 position: [...] This point is marked to describe group
    behaviors which indicate psychological tension. Boys
    may be trying hard in a signaling contest or a
    written examination. Sometimes tension may be apparent
    in the purposeless movements by the boys
    (purposeless in the sense that they seem to have no
    relation to the group goal at the time). These are
    such movements as hand-wringing, foot-twisting,
    tongue-chewing, drumming, and other nervous
    mannerisms.

5 position: Keyed up. Here tension is very obvious. Its physical
    signs are clenched fists or hunched bodies or extreme
preferable to record the frequency of each cue as it occurred. This
could be a tremendous job, however, since there could be a large
number of behavioral cues which might indicate the presence of one
of the factors listed above. If one were interested in a large number
of factors, the cues could mount to impossible numbers. Thus, rating
scales, which contain a variety of behaviors at each point on the
scale, are more efficient since they can provide more data per observer
and more dimensions per unit of time. The observer using a rating
scale is in the role of a human collating machine. He observes a
number of acts throughout the group, integrates them in his mind,
and makes a judgment as to which point on a number of scales best
describes his interpretation of the varied behavior.

Thus, at some points in a meeting when events may be [...]
so rapidly that it is impossible to record the nature of the
activities of each person, it may be easy to record a summary statement
along required dimensions by means of a rating scale. [...]
in some cases, one needs measures of change in behavior only at
the point where changes in other conditions occur. For example,
the rating scale on physical symptoms of tension earlier described
was used because it provided a quick [...]
when the group changed program [...]
teaching and learning condition to one in which [...]
playing a game). Many new and important cues would appear rapidly
under such changed state of affairs (perhaps a [...] of
tension) so that one could not record them all [...]
on a leisurely tabulation of whatever it [...]
an observer's physical limitations since the [...]
change again before an adequate population of [...]
tabulated. In such a case, ratings provide useful [...]

Rating scales can be useful in the [...]
a study. If one is uncertain what the cues are [...]
behavior, scales can be useful in defining [...]
is interested in determining the nature of [...]
used in identifying supporting [...] of his voice,
content of the words said by a [...] might all reveal clues
[...] the bodily postures, and his facial [...]
[...] the presence or absence of supportive behavior. [...]
Frequent ratings, with notations of the [...]
decision, can be made by several observers. When they compare
4

5 Considerable self-oriented need behavior

6

7 Almost all behavior of self-oriented type; a great deal
  of expression

8

9

10 All behavior of the self-oriented need type

The ratings describe the amount of self-oriented behavior each
individual displayed during the meeting. End-of-meeting ratings
such as these can be made on the basis of reflection and recall by
the observer, the practice most typically followed. However, Fouriezos,
Hutt, and Guetzkow asked their observers to tally instances
of the expression of five different types of needs: dependency, dominance,
aggression, status-seeking, and catharsis. Whenever any member
showed behavior which fell into one of these need areas, the
observer recorded that fact. Thus, at the conclusion of the meetings,
he was able to make subjective, integrating appraisals of his tallies
in the form of the single overall rating on the scale just described.
There is some evidence that such ratings are empirically valid.
Use of the above-mentioned scale in field studies has shown that
groups which are high in expressed self-oriented need tend to perceive
themselves as less unified than do low-need groups (6), and
persons of high prestige who do not have their prestige recognized
by a group over a series of meetings tend to increase in the amount
of self-oriented need behavior (14).

Note, then, that rating scales can also be used to summarize and
integrate data obtained by separate tallies on a category system.
This is often necessary to create a pattern in the data which simple
frequency counts on separate observation categories would not
provide.
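
The tallying step that precedes such a summary rating can be sketched as follows. The five need types are those named in the text; the member labels and the sample events are invented for illustration. The overall 0 to 10 rating remains the observer's own integrating judgment, so the code only arranges the tallies he would consult and does not compute the rating.

```python
from collections import Counter, defaultdict

# The five need types named in the text.
NEED_TYPES = ("dependency", "dominance", "aggression",
              "status-seeking", "catharsis")


def tally_needs(events):
    """Build a per-member count of each need type from (member, need) pairs."""
    tallies = defaultdict(Counter)
    for member, need in events:
        if need not in NEED_TYPES:
            raise ValueError(f"unknown need type: {need}")
        tallies[member][need] += 1
    return tallies


def totals(tallies):
    """Total number of tallied need expressions for each member."""
    return {member: sum(counts.values()) for member, counts in tallies.items()}


# Invented sample record of one meeting.
events = [
    ("member_1", "dominance"),
    ("member_2", "dependency"),
    ("member_1", "aggression"),
    ("member_1", "dominance"),
]
tallies = tally_needs(events)
print(totals(tallies))
```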

WHY RATING SCALES ARE USED. Ratings are used because they
yield reliable and quantified data within time and personnel limitations
which often exist in studies using observers. One may wish to
have some measure of varied items such as the interest being shown
by a group, or the amount of hostility, humor, or task-directed
behavior. Assuming that one knew all of the cues which could yield
an indication that one of these factors was present, it would be





experimenter must make in developing his methodology. This is 
particularly true with respect to selection of the behavior to be 
observed or rated and the definition of the categories into which 
these behaviors are to be placed. Without a knowledge of the pur- 
pose of the experiment or research and the theoretical setting in 
which the experiment takes place, no one can prescribe for an 
experimenter the dimensions with which he ought to be concerned, 
or the amount of inference he ought to require of his observers. The
experimental design and the specific hypotheses to be tested will
dictate the level of reliability of observed data which will be necessary.
Many other decision implications of the theoretical framework
could be stated; several of these will be made explicit in the discussion
of decision areas which follows.

Some of the decision areas to be discussed are more pressing
for rating scales than for category systems or vice versa. The problems
are, however, sufficiently common to both techniques to [...]
their discussion in connection with both.


The Frame of Reference

The frame of reference is involved, in part, in the [...]
in connection with the level of inference required [...]
in part, in the problems discussed under number of [...] We
can explicitly restate the problem in this dimension [...]
interests of reliability, observers must be clear [...]
to be observed and about the vantage [...]
observing, recording, coding, or rating it. [...] observers may be
instructed, for example, to observe the social [...]
dimension of interpersonal affect, i.e., the [...] which the members
of the group are personally fond of [...] observers are
their ratings, and the cues they used, they begin to be aware of the
behavior they use as indicative of supporting behavior. Further
refinement of the scale, another period of joint observations, and
comparative discussion make increasingly clear the clues that are
most indicative of this behavior.

Under ideal circumstances rating scales should provide data
which are strictly comparable to those obtained by the use of a
category system. This ideal is difficult to reach, however, since the
variety of behaviors which one must include within any one rating
may create problems of reliability. Then, too, a rating scale which
is discriminating enough for one type of behavior may be too gross
for another. If the scale is so gross that most of the data fall at one
end of the scale, the user runs into problems in the statistical treatment
and interpretation of the findings. Such problems as these are
treated in more detail in Chapters 6 and 11, on the construction
of measuring devices.
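
Whether a scale is too gross in this sense can be checked on trial ratings before the main study. The sketch below assumes a 0 to 10 scale and treats the two lowest and two highest points as the "ends." Both the width of that band and any cutoff for calling the pile-up serious are assumptions made for illustration, not values given in the text.

```python
def end_pile_up(ratings, low=0, high=10, band=1):
    """Return the share of ratings near the bottom and near the top of a scale.

    `band` is the number of points above `low` (or below `high`) still
    counted as belonging to that end; band=1 on a 0-10 scale counts
    0 and 1 as the bottom, 9 and 10 as the top.
    """
    n = len(ratings)
    if n == 0:
        return 0.0, 0.0
    bottom = sum(1 for r in ratings if r <= low + band) / n
    top = sum(1 for r in ratings if r >= high - band) / n
    return bottom, top


# Invented trial ratings from ten observed meetings.
trial_ratings = [9, 10, 10, 9, 10, 8, 10, 9, 3, 10]
bottom_share, top_share = end_pile_up(trial_ratings)
print(bottom_share, top_share)
```

A large share at either end suggests that the scale points need finer definition at that end before the ratings can support statistical comparison.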

What criteria can one use to decide whether one should use
single categories or rating scales? In general, whether one uses
frequency counts or ratings depends upon one's resources and the
demands of the problem. The greater the precision required, the less
one is likely to use rating scales.


PROCEDURES IN DEVELOPING A SYSTEM 
OF OBSERVATION 

Systematic observation of social behavior has a relatively short
history. Our information is still too limited to give the researcher
many rules of thumb which he can follow in the construction of his
[...] tools. We can, however, point out the major decisions
which the experimenter must make, discuss some of the considerations
which must be taken into account, and suggest some criteria
which might be used in evaluating various alternatives. In fact, it is
clear to those who have had considerable experience in this area that
there is no single solution to each of the problems; the best solution
is in terms of the objectives of the study.

The importance of theory in the design of experiments is often
neglected. We need only point out here that the theoretical framework
plays a central role in determining the decisions which the
behavioral unit is to be taken into account in the coding or rating.
Few would seriously argue that the context (the situation in which 
the behavior being coded or rated takes place) should be ignored. 
It is rather a problem of what part of the context, how much of it, 
is to be taken into account. In the coding of the problem-solving
functions, for example, a specific contribution might serve to enlarge 
on or to develop a previous contribution. This earlier contribution 
might have been opposing a still earlier contribution. One coder 
might classify the present contribution as developing, whereas a 
second coder, having the larger context in mind, would classify it as 
opposing. The experience of some workers in this area indicates 
that, if theoretical considerations do not dictate the answer, using 
the most immediate situational context as the frame of reference 
makes for the greatest amount of agreement. This is especially true
if interaction is rapid.

THE INTENT AS A FRAME OF REFERENCE. Another prominent
source of interobserver disagreement is the extent to which observers
permit their judgment of the intent of the actor to color their ratings
or categorization of his behavior. This is often a problem when it is
the explicit purpose of the experimenter to ignore the objectives
implied in a given statement. An illustration of this difficulty is that
of the group member who is asking questions concerning the implications
of a proposed solution to a problem. The observer oriented
to ignore intent would classify this behavior as information-seeking.
The observer oriented toward taking intent into account might code
this same questioning behavior as attacking or opposing, basing
his judgment on the manner in which the questions were asked.

The experimenter may, however, be interested precisely in
the intent of an act. When this is the case, two sources of observer
unreliability may occur: (1) observers may disagree concerning the
nature of the cues to use in identifying the intent of the actor; or (2)
observers may disagree in the degree to which they take intent
into consideration. In short, explicit instructions should be given
observers concerning the cues to use for identifying intent and the
extent to which judgments concerning intent or motivation for an
act should enter into coding or rating.

THE EFFECT AS A FRAME OF REFERENCE. A specific aspect of the
context problem mentioned above is the influence of the effect of the
with a total simple participation as the unit. A lengthy verbal statement
too frequently contains elements which permit classification
in several categories.


Sampling Methods in Observation 

It often happens that one cannot, or does not want to, record
all of the behavior that takes place in a given situation. This may
occur because the meeting lasts too long to observe continuously,
too many persons are present, interactions come too rapidly to record,
or for any of a variety of reasons. In such a case we are forced
to develop a method of obtaining a representative sample of the
behavior being observed. This can be done in a number of ways: (1)
Attention can be concentrated on the behavior of a few of the
members, ignoring the others present. (2) Attention can be directed
to each person, or to a number of persons in the group, each for
[...] (3) The whole observation instrument can
be divided into parts and the social setting can be observed in terms
of each part of the observation schedule for a standardized length
of time. (4) Observations can be made only when certain key behaviors
have been introduced into the meeting. (5) The most frequently
used system for obtaining representative samples of the
[...] observed is the time-sampling system, in which a
[...] selected during which observation takes
[...] that these parts will be an adequate description
of all of the events.
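
A time-sampling schedule of the kind named in item (5) can be laid out in advance. The sketch below assumes fixed periods of observation alternating with fixed periods of no observation. The particular lengths (five minutes on, ten minutes off) and the sixty-minute meeting are illustrative defaults and not a recommendation.

```python
def sampling_windows(meeting_minutes, observe=5, skip=10):
    """Return (start, end) minute pairs during which observation takes place.

    Observation starts at minute 0, lasts `observe` minutes, is followed
    by `skip` minutes without observation, and so on until the meeting ends.
    """
    windows = []
    start = 0
    while start < meeting_minutes:
        end = min(start + observe, meeting_minutes)
        windows.append((start, end))
        start = end + skip
    return windows


def coverage(windows, meeting_minutes):
    """Fraction of the meeting that falls inside an observation window."""
    observed = sum(end - start for start, end in windows)
    return observed / meeting_minutes


windows = sampling_windows(60)
print(windows, coverage(windows, 60))
```

Writing the schedule out makes plain how much of the meeting goes unobserved, which is the quantity one must weigh against the assumption that the sampled parts describe the whole.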

[...] sampling procedures unless one
has [...] guide the selection of the sample. As an
[...] observe a number of
behaviors [...] but for some reason it is necessary to use a
time-sampling [...]. We make the decision to observe in terms
of a [...] categories for five minutes, skip ten minutes (during
which [...] observing with a different set of "spectacles"),
and return for another five minutes with the original categories,
and so on. Any [...] situation, however, is a changing set of activities
and [...] discover that certain events occur during the ten
minutes [...] observing which distort the representativeness
of the [...] during our time samples. Thus, we would
rightly be suspicious of the adequacy of our data. We could have
the observers are thoroughly familiar with the theory and purposes
of a study, the trainer can communicate with neophyte observers
concerning the inner workings of the processes being observed, the
boundary conditions surrounding certain categories, and the need
for perceiving certain familiar behavior in an unusual fashion.

2. If possible, it is helpful if the trainees attempt observation
without the aid of a refined observation schedule. This means that
they would watch a situation comparable to one they would see in
the actual study and would attempt to identify as many of the
relevant behaviors as they could. The discussion among the observers of
the events and behaviors they have seen accomplishes several things.
First, it makes the observers aware that behavior can be seen which
might not ordinarily be noticed unless they were looking for it. They
become sensitive to the fact that the behavior, or other variables
which they will be asked to describe, can be found if they are set
to watch for them. Secondly, this unskilled performance on the
observers' part will reveal the need for operating in the future in
terms of a carefully defined set of categories. The need for careful
agreement among them as to what to watch for and how to record
the behavior will become apparent as they discover that there are
disagreements among them as to what happened, as to the
significance of certain events, as to what to call certain activities, and so on.

3. The observers are ready, now, for a more refined instrument,
and the observation schedule may now be introduced. We are
assuming that the process of developing it has already been completed
and that the schedule is prepared for use. In our experience the
observers are always struck by the complexity of the observation
instrument, no matter how simple it may be. A set of observation
categories or rating scales will look more complicated to the observer
at first glance than it turns out to be in practice. If one counts on
this, proper reassurances and practice procedures can be planned.

Each of the items in the observation schedule is explained and
questions are answered. In most cases it is helpful if each observer
is provided with an instruction booklet which describes the purpose
of the study, the purpose of each item to be observed, cues which
may be used for each category, the solution for certain marginal
cases, the nature of an adequate notation, sampling instructions,
other procedural instructions, etc. Obviously, the more the observer
is required to make inferences or interpretations of given phe-
helps one to make decisions. If one simply needed to know the
frequency of oral statements which followed each act of the leader,
time sampling could be used with little fear that the data would 
be inaccurate. If, however, one needed to know how constructive the 
behavior of the group was after a given act of the leader, one would 
find that the whole sequence of events subsequent to the act would 
always need to be observed and that a time-sampling procedure 
might cut off observation in the middle of the psychologically 
meaningful data needed. 

To summarize, there are certain criteria which will help us
decide whether we should use time sampling in observation. It
should be used only when it is apparent that all of the behavior
which is relevant to the study cannot be recorded. The theory of the
study can guide us in selecting the best sampling procedure so that
it does not destroy the reliability or the psychological
meaningfulness of the material being noted.
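
The risk in a fixed time-sampling plan can be made concrete with a small sketch. The five-minutes-on, ten-minutes-off schedule is the one used as the illustration in this section; the function names, the 45-minute meeting, and the event times are hypothetical and serve only to show how events falling between samples go unrecorded.

```python
def observation_windows(total_minutes, observe=5, skip=10):
    """Return (start, end) minute intervals during which observers record."""
    windows = []
    start = 0
    while start < total_minutes:
        end = min(start + observe, total_minutes)
        windows.append((start, end))
        start += observe + skip
    return windows


def split_events(event_minutes, windows):
    """Separate events into those seen and those missed by the sample."""
    seen, missed = [], []
    for minute in event_minutes:
        if any(start <= minute < end for start, end in windows):
            seen.append(minute)
        else:
            missed.append(minute)
    return seen, missed


# Hypothetical 45-minute meeting with relevant events at these minutes.
windows = observation_windows(45)
seen, missed = split_events([2, 7, 16, 22, 31, 40], windows)
print("windows:", windows)
print("seen:", seen, "missed:", missed)
```

If the missed events are of the same kind as the recorded ones, a frequency count may still be representative; if they form part of a sequence that must be followed from beginning to end, the sample cuts that sequence apart.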


TRAINING OBSERVERS

The use of group observers means that we are using people as
measuring instruments. A good measuring instrument is one which
will [...] measure at various times what it is supposed to
measure. If [...] is to become an “observing instrument,” he
must [...] This may be [...] upon the data needed. It is
an important [...] preparation of a study using group
observers and [...] attention. A well-developed
observation [...] worth to [...] point out practices which
have been found successful [...] typically begins with a
description of the theory [...] reason, and thus a
motivation, for doing well. [...]
explains why the observation schedule is constructed as it is. Once
a study and to discuss them intelligently but that some of them
may not be able to “see” these things in a group of interacting
people. A study by Luszki (11) provides some insights which are
relevant. She found that there is a positive relation between
sensitivity to the feelings and emotions of others (empathic ability) and
the ability to “see” what is happening in a social situation. Persons
with good empathic ability (1) are better able to see what is
happening in the role performance of others, (2) have good personal
adjustment, as measured by the instruments used in the study,
(3) have insights into themselves which are similar to the evaluations
made of them by others, (4) have more stable, positive, and secure
feelings about the self and somewhat favorable perceptions of others,
(5) have a more consistent and more favorable perception by others.
One may assume, then, that not all persons will do equally
well as observers and some testing of competence may force the
experimenter to retire less capable ones from the research.

As we have already stated, the observers often make valuable
proposals for improving the observation procedures. Thus, it is
important that they have the opportunity to participate in making
suggestions. In some cases, in fact, the training may be more effective if
the observers participate in all stages of constructing the observation
schedule. Bavelas (3) trained two groups of observers. One was given
training in the use of a prepared observation schedule in a manner
similar to that described earlier; the other group participated in
the construction of the categories from the very beginning. In the
light of its knowledge of the purposes of the study, the group was
able to determine the nature of the categories, the definition of
each, and the rules for their use. Bavelas reports, in this unpublished
study, that those observers who participated in the development of
the observation scheme were trained more quickly (as measured by
the length of time it took to achieve good reliability among
themselves) than were those observers who were told what the categories
meant and how they were to be used, even though the latter group
had more instructional time given to them.


RELIABILITY AND VALIDITY


A detailed discussion of reliability and validity is contained in
Chapter 6. The treatment which follows here concentrates on the
nomena, the more specific and detailed must such instructions be
to achieve clarity of understanding.

4. The observers are now ready for trial use of the form. It is 
usually adequate to make the first attempts while they observe a 
group role-playing the type of behavior they will be observing. The 
value of such a procedure is that any trainee may stop the role- 
playing whenever he has a question concerning the proper coding 
for a given event. 

5. This try-out is followed by extensive discussion of the experi- 
ence. The trainees will have had many problems in selecting proper 
categories, sampling, keeping up with the events, etc. These ques- 
tions are answered by discussion and further practice. It often hap- 
pens that new observers have suggestions for revising the observation 
schedule which make considerable improvement in its efficiency 
as a measuring instrument. 

6. Whenever the study allows, it is advisable that the observers 
have an opportunity to make a pilot run on a group like those they 
will be asked to observe. This will assure that uniform practices 
are developed for such problems as introducing the observer, eti- 
quette of the observer while watching the group, and any additional 
problems which arise in the actual test situation and which were not
foreseen in the role-playing and schedule-development phases. 

7. Either the pilot run or a later trial will serve to provide data 
for determining whether the observers are doing a comparable job.
Whenever possible, it is best, of course, that the research worker 
be assured that his team of observers are reliable before the actual 
data gathering begins. 

In general, one can expect that the observers will have the 
greatest problems on those categories which require integration or 
collation of complex phenomena. They will have the least difficulty, 
in contrast, with those events which are simple objective occurrences
which require little insight or sensitivity on the part of the
observers. Studies of coder reliability (7) have found that there is most
disagreement on data which are complicated and demand much
inference. Since an observer functions as a highly trained coder, it 
is quite likely that he will have similar problems. 

This suggests that the skills required of an observer cannot be 
performed by all persons equally well, quite aside from the academic
training they may have had. One may find that a group of observers
have comparable ability to understand the phenomena involved in 
social situation might differ in their receptivity to influence attempts.
If, then, the measure of hostility shown predicted the amount of 
change which actually occurred, one might argue that the validity 
of the measure had been established. The statement made earlier, 
that the theoretical formulation must be precise, ought to be under- 
scored. In other words, the conclusion concerning the validity of a 
measure in the last instance given depends on whether or not this 
relation might be predicted. 

Sometimes experimenters are satisfied with the validity of an 
instrument when it relates in a significant way to some other
variable. This is a pure case of being satisfied with the validity of an
instrument on the basis of its predictive efficiency—it is what Cron- 
bach (4) refers to as empirical validity, in contrast to logical validity. 
Logical validity asks: "Is this measure measuring what it is supposed 
to measure?" 

One of the difficulties involved in establishing external criteria 
in social observation research is that social-psychological theory 
often prevents one’s having faith in what might superficially appear 
to be a satisfactory external criterion. Suppose, for example, that 
observers were asked to note the extent to which the group members 
were satisfied with the leader. It might superficially appear that 
one way of checking on the validity of this observer rating would 
be to ask the members of the group themselves how satisfied they 
were with the behavior of the leader. There are good reasons, how- 
ever, for thinking that in many circumstances it would be impossible 
to accept the report of the members as a validity check. There 
might, for example, be strong social pressures against reporting 
dissatisfaction with the leader. Thus, in many social-psychological 
experiments, the relation between observer ratings and external
criteria is a theoretical and not a methodological problem. 

Most of the validity problems with social observation schedules
arise in connection with category or rating systems requiring a great
deal of inference on the part of observers. There are many category
systems which require little inference, and the validity is
established by the content. In these cases the validity question is not
important or critical. For example, the problem-solving category
system described earlier which has been utilized in describing the
problem-solving behavior of a group has in it categories such as
"Proposes solutions," "Gives information." These categories have
unique problems which arise in connection with observational
instruments. 


Validity 

The problem of the validity of measures of social interaction
has, for a number of reasons, often been slighted. A brief discussion 
of these reasons may illuminate the problems which arise. 

One way of phrasing the problem is to say that the validity of
a measure varies with the degree of relation it has to an external
independent criterion. This, as Stouffer points out (13), is a problem
of prediction. As was indicated in Chapter 6, there are actually two 
kinds of prediction situations. In the one case there is a generally 
accepted measure of the variable in question and the validity of 
the new measure being developed is determined by its relationship 
to this other, independent measure. One of the reasons for neglect- 
ing the validity of measures of social observation has been the lack 
of such external criterion measures. It is difficult, for example, to
think of an external criterion of a category such as “Shows
dependence.”

The second kind of prediction situation might be described as 
the prediction of a relation which is derived from a well-formulated 
theoretical system. In this case, if the proposed measure makes dif- 
ferential predictions concerning the behavior of people in a social 
situation and these predictions are confirmed in an experiment, 
one could argue that the validity of the measure had been
demonstrated. Let us take a simple illustration. Suppose that we had
available to us a valid measure of feelings of hostility toward 
authority figures. Suppose, further, that our theory led us to state 
the conditions under which hostile behavior in a social situation 
would develop as a function of these feelings. And suppose that,
on the basis of the test scores, we organized two groups, one
composed of hostile and the other non-hostile people, and that we
observed them in such a situation. If, then, our observer measure
of “amount-of-hostility shown” differentiated between the two
groups, we could say that it was a valid measure of hostile behavior. 

Sometimes the kind of prediction one makes is even less direct 
than in this illustration. For example, one might predict that two
groups of people differing in the amount of hostility shown in a 
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,,.hat Guttman calls internal validity (13). In ‘"ese “tegory^ystems 
Te validity is established on the basis ot acceptab.lity. There ts 
aLptance, for exanrple. d.t a 

that an item of information is an item of informa • ‘ J 

based on this common acceptance requires more 1 !'“/'^*'”"® 
a precise definition of the cues to be used in assigning a p 

Another reason for neglect of validity problems is the nature
of the variables with which social psychologists are concerned. They
are frequently complex, highly inferential variables for which
adequate external criteria are rarely available. The social psychologists
with their social observation systems are thus often in much the
same situation as are clinicians with their complex, highly
inferential diagnostic categories which are based on projective-test results.

In spite of these admittedly sizable obstacles to the
establishment of the validity of an observational system, it is clear that
researchers in this area have not given the problem of validity as
much consideration as it should have received. Many of the
techniques mentioned in Chapter 6 for establishing the validity for a
measuring instrument are appropriate for use in observational
systems. The reader is referred to these for precise designs which might
meet his validity problems.


Reliability 

The reliability of observational instruments has been much
more a matter of concern to investigators of social behavior than
has validity. For purposes of exposition it may be useful to make
a distinction that is not often made. In assessing the reliability of a
system of social observation, it is necessary to differentiate the
reliability of the behavior being observed from the reliability of the
categorization or rating which is made of that behavior. In other
words, the reliability of the observer and of the behavior are
separate problems. It is clear, of course, from this distinction that the
consistency of behavior is a substantive problem, whereas the
consistency of an observer is a methodological problem. Once the
consistency of an observer has been established, it becomes possible
to tackle the problem of the consistency of the behavior.

The task of assessing degree of agreement among observers, in
the case of a category system, breaks down into at least two separate
reliability problems. The first is the extent of agreement among 
observers with respect to the number of units coded. This is essen- 
tially the extent to which the observers agree as to the boundaries 
of the unit which is to be categorized. Guetzkow (7) has pointed out, 
as has Bales (2), that unreliability in this area is an important factor 
affecting the reliability of categorization itself. It seems clear that 
until there is rater agreement as to the boundaries of units to be 
categorized, there is little purpose in assessing rater agreement on 
categorizing. The second principal task in assessing reliability is 
to determine the extent to which observers agree on the category or 
rating they assign to a specific unit of behavior. 

The most frequently used statistic in appraising degree of 
agreement between observers has been the correlation coefficient. 
This is especially useful, provided the assumptions underlying its 
use can be met, because the extent of agreement that is being ob- 
tained can be evaluated in terms of a fixed standard. It is a separate 
question, however, whether in a given case one needs a correlation 
of 0.7, 0.8, or 0.9 to be certain that one's degree of agreement is 
satisfactory. It is apparent that the question immediately becomes: 
satisfactory for what? In other words, the experimenter or investi- 
gator must ask himself the purpose of his observational scores. His 
theory will indicate the extent to which large or small differences 
are to be expected. It is a truism that where fine differentiation is 
necessary the scores must be more reliable than where gross differ- 
ences can be expected. It is impossible to state categorically that 
observational scores should be at such and such a level of reliability 
to be useful, for the usefulness of a score depends on the use
to which it is to be put. 
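
As an illustration of the correlation approach, the following minimal sketch computes the product-moment correlation between two observers' ratings of the same persons. The function name, the 5-point scale, and the ratings are hypothetical and are not taken from any study cited here.

```python
from math import sqrt


def pearson_r(ratings_a, ratings_b):
    """Product-moment correlation between two observers' ratings."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) < 2:
        raise ValueError("need two equally long lists with at least 2 ratings")
    n = len(ratings_a)
    mean_a = sum(ratings_a) / n
    mean_b = sum(ratings_b) / n
    cross = sum((a - mean_a) * (b - mean_b)
                for a, b in zip(ratings_a, ratings_b))
    ss_a = sum((a - mean_a) ** 2 for a in ratings_a)
    ss_b = sum((b - mean_b) ** 2 for b in ratings_b)
    if ss_a == 0 or ss_b == 0:
        raise ValueError("correlation is undefined when ratings do not vary")
    return cross / sqrt(ss_a * ss_b)


# Hypothetical ratings of six group members on a 5-point scale.
observer_a = [3, 5, 2, 4, 4, 1]
observer_b = [3, 4, 2, 5, 4, 2]
print(round(pearson_r(observer_a, observer_b), 2))
```

It should be kept in mind that the coefficient reflects how consistently the two observers order the persons rated; one observer could rate everyone a point higher than the other and the correlation would be unaffected.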

The second principal statistical device which has been utilized 
in characterizing amount of agreement between observers has been 
a percent agreement score. This is essentially a matter of computing 
the percentage of the total number of items which were classified 
in the same way by the two observers or by all the observers com- 
bined. Sometimes it is useful to modify the equation slightly by 
having as the numerator the percentage of items on which there
was agreement, and in the denominator the sum of the items on
which there was agreement and disagreement. It is necessary with
either of these equations to have some fixed number of items
subjected to classification. Guetzkow (7) has provided statistical devices
for evaluating the level of agreement.
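
A percent agreement score of the simple kind can be sketched as follows. The sketch assumes that both observers have coded the same fixed list of units, so that the codes can be compared position by position; the function name and the example codes are hypothetical.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of units placed in the same category by two observers."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both observers must code the same fixed set of units")
    if not codes_a:
        raise ValueError("at least one unit is required")
    agreements = sum(1 for a, b in zip(codes_a, codes_b) if a == b)
    disagreements = len(codes_a) - agreements
    return 100.0 * agreements / (agreements + disagreements)


# Hypothetical codes assigned to the same five units of behavior.
observer_a = ["proposes", "informs", "informs", "proposes", "informs"]
observer_b = ["proposes", "informs", "proposes", "proposes", "informs"]
print(percent_agreement(observer_a, observer_b))
```

The requirement of equal-length lists is the computational form of the point made above: the score is meaningful only when the observers already agree on what the units are.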

Bales (2) illustrates the use of the chi square to evaluate degree 
of observer agreement. He points out that his use of chi square is 
different from the more conventional applications since his use 
applies it to a situation which does not represent random sampling. 
The principal advantage of chi square is that it does not require 
the assumptions of the usual parametric techniques. 
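
One way such a comparison might be set up is sketched below: the two observers' category totals are arranged in a two-row table and the ordinary chi-square statistic is computed, a small value indicating that the two profiles are much alike. This is a generic illustration with hypothetical counts and a hypothetical function name; it is not offered as the exact procedure Bales describes.

```python
def chi_square_statistic(counts_a, counts_b):
    """Pearson chi-square for a 2 x k table of two observers' category totals."""
    if len(counts_a) != len(counts_b):
        raise ValueError("both observers must use the same categories")
    total_a = sum(counts_a)
    total_b = sum(counts_b)
    grand_total = total_a + total_b
    chi_square = 0.0
    for a, b in zip(counts_a, counts_b):
        column_total = a + b
        if column_total == 0:
            continue  # a category neither observer used adds nothing
        for observed, row_total in ((a, total_a), (b, total_b)):
            expected = row_total * column_total / grand_total
            chi_square += (observed - expected) ** 2 / expected
    return chi_square


# Hypothetical numbers of units placed in three categories by each observer.
observer_a_counts = [12, 18, 30]
observer_b_counts = [10, 20, 30]
print(round(chi_square_statistic(observer_a_counts, observer_b_counts), 3))
```

For a table with k categories the statistic is referred to the chi-square distribution with k - 1 degrees of freedom. Note that the logic is the reverse of the usual test: here a small, non-significant value is the desired outcome.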

It seems useful to point out that the investigator ought to be 
concerned with the reliability of the measure actually used in the 
analysis of the data. For example, it is a matter of relative
unimportance whether observers agree with respect to the number
of units of behavior assigned to a specific individual if the score 
with which the investigator is concerned is the number of units 
in each category made by the group as a whole. In that case the 
reliability of individual scores is not of concern as long as high 
group reliability is present. Another illustration is the reliability 
of the categorization of each unit as opposed to the reliability of a 
category score for an individual. That is, the agreement between
observers as to the percentage of responses categorized in each of 
the categories in the system may be very high even though the 
observers show relatively low agreement on the categorization of 
each item. This condition occurs, of course, only when the errors 
of categorization are fairly random. The point is, however, that the 
investigator must ask himself what score he is going to use and must 
then assess the reliability of that score. 
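
The difference between the reliability of a category score and the reliability of item-by-item categorization can be shown with a small constructed case. The data below are hypothetical and deliberately extreme: the two observers arrive at identical category percentages while agreeing on only a third of the individual units.

```python
from collections import Counter


def category_percentages(codes):
    """Percentage of all units that an observer placed in each category."""
    counts = Counter(codes)
    return {category: 100.0 * n / len(codes) for category, n in counts.items()}


def item_agreement(codes_a, codes_b):
    """Percentage of units given the same category by both observers."""
    same = sum(1 for a, b in zip(codes_a, codes_b) if a == b)
    return 100.0 * same / len(codes_a)


# Hypothetical codes for the same six units of behavior.
observer_a = ["proposes", "informs", "proposes", "informs", "proposes", "informs"]
observer_b = ["informs", "proposes", "proposes", "informs", "informs", "proposes"]

print(category_percentages(observer_a))
print(category_percentages(observer_b))
print(round(item_agreement(observer_a, observer_b), 1))
```

If the investigator's analysis uses only the percentage of acts in each category, the first two figures are the ones whose reliability matters; if it depends on how particular acts were coded, the third is.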

The section concerned with problems involved in the
organization of a category system emphasized certain decisions which must
be made, many of them with considerations of reliability in mind. 
The immediately preceding section, concerned with the training of 
observers, also emphasized certain problems with the ultimate 
reliability of the scores in view. Detailed procedures, statistical and 
design-wise, for determining the reliability of an observer rating 
or observer categorization system can be found in most textbooks in 
statistics. However, let us make two final remarks. 

It has been demonstrated that the reliability of ratings increases 
with the number of judges. This, of course, depends in part upon 
the particular trait and the manner of making the ratings, but the 
generalization seems to hold. In any case, the reliability should be 
determined for each set of conditions, and the degree of reliability
desired by the investigator will then determine the number of
judges he will need. Using more judges to increase the reliability
of observer scores does not seem to us to be a satisfactory substitute
for improved precision in definition of the category or the trait
to be rated, nor should it be used in lieu of more precise elaboration
of cues to be used by the observers in assigning behavior to
categories, or instead of adequate training of observers.
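
The text does not name a formula for the relation between reliability and number of judges. A common choice, used here purely as an assumption, is the Spearman-Brown formula, which supposes that the judges are interchangeable and that their ratings are averaged. The function names and the figures in the example are hypothetical.

```python
import math


def pooled_reliability(single_judge_r, n_judges):
    """Spearman-Brown estimate of the reliability of n averaged judges."""
    return n_judges * single_judge_r / (1 + (n_judges - 1) * single_judge_r)


def judges_needed(single_judge_r, target_r):
    """Smallest number of judges whose pooled rating reaches target_r."""
    if not (0 < single_judge_r < 1 and 0 < target_r < 1):
        raise ValueError("reliabilities must lie strictly between 0 and 1")
    n = target_r * (1 - single_judge_r) / (single_judge_r * (1 - target_r))
    # Subtract a tiny tolerance so floating-point noise does not add a judge.
    return max(1, math.ceil(n - 1e-9))


# Hypothetical case: a single judge's ratings have a reliability of 0.5.
print(pooled_reliability(0.5, 3))
print(judges_needed(0.5, 0.9))
```

Under these assumptions a modest single-judge reliability requires a large panel to reach a high pooled value, which is one practical reason for preferring sharper category definitions and better training to the simple addition of judges.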

In some cases the investigator may find it useful to combine
categories to obtain a satisfactory level of rater agreement. This
is true because often the lines of demarcation are fine and the
distinction is difficult to make. Where it is possible, without
introducing conceptual confusion, the investigator may expediently combine
categories between which observers have difficulty discriminating.
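
The effect of combining categories can be sketched with a constructed example. The category "suggests" is hypothetical and is assumed, for illustration only, to be one that observers confuse with "proposes"; the function names are likewise hypothetical.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of units given the same category by both observers."""
    same = sum(1 for a, b in zip(codes_a, codes_b) if a == b)
    return 100.0 * same / len(codes_a)


def merge_categories(codes, mapping):
    """Recode units, replacing each category listed in mapping by its target."""
    return [mapping.get(code, code) for code in codes]


# Hypothetical codes for the same five units of behavior.
observer_a = ["suggests", "proposes", "informs", "proposes", "suggests"]
observer_b = ["proposes", "proposes", "informs", "suggests", "suggests"]

before = percent_agreement(observer_a, observer_b)

# Treat the two hard-to-distinguish categories as a single category.
mapping = {"suggests": "proposes"}
after = percent_agreement(merge_categories(observer_a, mapping),
                          merge_categories(observer_b, mapping))
print(before, after)
```

The gain in agreement is bought at the price of a coarser description, so the merged category must still be one that the theory guiding the study can use.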


THE RELATION OF THE OBSERVER TO THE 
GROUP BEING OBSERVED 

One of the most frequent questions addressed to the
investigator who reports a study which used observers is: Did the
presence of the observers influence the behavior of the group? Usually
he must answer that he has no evidence that the observers
influenced the results in any way but that they might have. It is quite
conceivable that the presence of an observer may be an important
variable. This depends upon the nature of the group, the type of
observation, the nature of the group's activity, the nature of the
variables being observed, and a number of other things. Arsenian (1)
found that the simple presence of an adult sitting near the door
seemed to lend assurance to a group of nursery school children.
Yet the presence of observers was a threat to young boys at a summer
camp, according to Polansky (12). The influence of observers is a
methodological problem which needs more careful study.

Deutsch (5) found that the members of small groups which met
frequently over a period of three weeks were aware of the presence
of observers at the beginning of their work together but had become
almost oblivious to them by the end of that period. Half of these
groups were constructed in such a way that the members cooperated
with one another to achieve a common goal; the rest of the groups
were in a situation in which the members competed with one
another. It is interesting to note that the competitive groups were 
much more conscious of the observers than were the cooperative 
groups. Thus, procedural differences within a group may influence 
the way in which the observers are perceived. 

Polansky (12) noted that changes took place in the perception 
by boy campers of the purpose and role of cruising observers. Early 
in the summer, research observers were well accepted by the camp- 
ers, who had been told that the strangers were neutral persons who 
were interested in learning how they could improve the camp. By 
the third week, however, they had become the objects of aggression
by many of the boys. The observers decided that their role was too 
ambiguous and represented a threat to these boys, many of whom 
were rebellious against adult authority. It was felt that the campers
were projecting their feelings of hostility toward adults onto the 
observers. After the observers changed their behavior to become 
more warm, human, and friendly, they found that they were no 
longer rejected by the campers. 

Naturally, we must be cautious about generalizing from such
an experience as this. In a laboratory situation, for example, where 
the group members are working toward a decision, it would no 
doubt be quite disruptive for an observer to indicate that he feels
warm and friendly toward the participants. In short, in some situ- 
ations we may want one perception of the observer; in another 
situation we may need quite a different one. 

Bales (2) has used observers in a wide variety of laboratory 
arrangements. They have sat with the group, or behind a one-way 
screen with the group aware they are there, or behind the screen
with the group wondering whether they are there. He has found 
no difference in the behavior of the groups which could be attrib- 
uted to the influence of these various positions of the observers. 

Thus, it appears that group observers sometimes influence a
social situation. When in doubt, it is wisest to explain the function
of the observers to the group in some way that does not spoil any
necessary naivete of the subjects. The observers will do best, barring
any special conditions such as those met by Polansky (12), by
showing all of the external signs of a piece of furniture. If they behave
like a person who is having no reactions to the events being
observed, the group members will perceive them as such. For most
situations that is probably best.


SUMMARY 

The great increase in the past few years in the use of observation
methods has been accompanied by an increase in methodological 
sophistication. Although there is still much methodological research 
to be done, there is a considerable body of experience to guide the 
experimenter. 

There are two principal types of observation instruments:
category systems and rating scales. A category system consists of one or
more categories or statements describing a class of phenomena into
which observed behavior may be coded. Category systems differ on
the following dimensions: Exhaustiveness (how much of the total
observable behavior is classified into a defined category); Inference
(the amount of inference concerning motives, feelings, etc., required
from the observers); Number of dimensions (the number of different
frames of reference required by the system); Discrete vs. continuous
(the extent to which the categories can be ordered on some
continuum). There is no simple answer to such questions as: How
exhaustive should the system be? How much inference should I 
require? The principal factor influencing the decisions in these cases 
should be the theoretical framework guiding the research. The 
second set of considerations affecting the decisions should be those 
concerned with the competence or trainability of the available 
observers, since decisions in these areas have important bearing on 
reliability and validity of observer scores. 

Rating scales have been used to describe the behavior of indi- 
viduals as well as the activities of an entire group. They may be 
used to record behavior at frequent intervals throughout an inter- 
action period, or to assess the nature of an entire social event after 
it has ended. Rating scales are particularly useful when a large 
number of factors are under consideration and the notation of the 
occurrence of behavior relevant to these factors would be an enor- 
mous task. They are also especially helpful in the early stages of an 
investigation as a device for the development of clear definitions 
of dimensions and the specification of the relevant cues. Under ideal
conditions, rating scales should provide data which are strictly comparable
to those obtained with a category system with continuous
categories.

In the construction of any observation system, the investigator
must make decisions concerning the frame(s) of reference to be
used by the observers. This involves deciding (1) how much of the
context of an act is to be taken into account in the coding or rating,
(2) whether the judgments as to the intent of the actor are to affect
the classification of the act, (3) whether the consequences of the
act are to affect the observers’ scoring. Failure to decide these issues
is a common source of lack of agreement among observers. Another
decision area has to do with the size of the unit to be categorized
or rated. Units may vary from single acts to total meetings. Failure
to specify clearly the unit to be assessed also affects interobserver
agreement. Finally, there is the problem of deciding whether to
record all of the relevant behavior in a given interaction period
or to sample the behavior. In general, there is considerable risk
involved in sampling unless one has adequate theory to guide the
sample selection.
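
The remarks on interobserver agreement can be made concrete with a
small calculation. The following is a minimal sketch, not a procedure
given in this chapter: it computes simple percentage agreement between
two observers who coded the same sequence of units. The function name
and the category labels are hypothetical, chosen only for illustration.

```python
def percent_agreement(codes_a, codes_b):
    """Share of units that two observers placed in the same category."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both observers must code the same units")
    if not codes_a:
        raise ValueError("no units were coded")
    # Count the units on which the two observers chose the same category.
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


# Hypothetical codings of five consecutive acts by two observers.
observer_1 = ["asks", "gives", "gives", "agrees", "asks"]
observer_2 = ["asks", "gives", "agrees", "agrees", "asks"]

print(percent_agreement(observer_1, observer_2))  # 0.8
```

Percentage agreement takes no account of agreement expected by chance,
so it is best read as a rough first check on whether the frame of
reference and the unit have been specified clearly enough.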

The use of observers means the use of people as measuring
instruments. This requires the careful calibration of personnel. The
training process requires that the observers (1) be familiar with the
theoretical framework and the purposes of the investigation, (2) have
experiences in which they become sensitive to the dimensions under
consideration, (3) have extensive training experience with the proposed
observation schedule, including a trial run, preferably with
a group like that which they will be observing.


The Analysis
of Data


There are few, if any, problems of analysis which are 
peculiar to research in social behavior. There are, however, three 
general problems which, although not unique to this area, assume 
major importance for researchers in this field. The three chapters 
in this section deal with these three major problems. 

Much of the data in social research is collected in what may 
be called qualitative form. The techniques of translating such data 
to a form which can be subjected to more rigorous analysis are 
presented in the first chapter of the section. 

Problems of scale construction are prevalent in almost any empirical
field. It seems, however, that the solutions to these problems
which have proved adequate in other areas do not handle adequately
most of the scaling problems encountered in social research. The
second chapter in this section presents an approach to scaling which
is perhaps more adaptable to this field.

Testing the statistical significance of research findings has
always been a problem where the findings are based on relatively
few cases and the distributions are highly irregular. This situation
is a relatively common one in much of the research done in the
field with which we are concerned. The development of “distribution-free”
statistical techniques may solve some of these problems.
The last chapter in this section describes the more important of
these techniques.



CHAPTER TEN 


Analysis of Qualitative 
Material 


Dorwin P. Cartwright 


One of the basic skills required of the social psychologist is 
that of analyzing symbolic or “qualitative” material. A remarkably 
large portion of modern social-psychological research consists in 
classifying, ordering, quantifying, and interpreting the verbal and 
other symbolic products of individuals and groups of people. In 
this chapter we shall consider some of the kinds of materials which 
may be analyzed systematically, the major principles involved in 
converting symbolic “phenomena” into scientific “data,” some 
criteria useful in guiding decisions that must be made in constructing
the system of categorization, and some practices found
to be helpful in the actual process of categorizing symbolic materials. 

The special problems associated with collecting and recording 
symbolic behavior and those involved in the statistical manipulation 
of the processed data are treated in other chapters of this volume. 
Although it is convenient to discuss these separate topics in separate 
chapters, it is important to realize that, in practice, decisions about 
the analysis of these materials cannot be made apart from the total 
plans for the collection and statistical treatment of the data. The 
ways in which the data are collected will set up severe limitations 
on the types of analysis which will be feasible. And, in turn, the 
kind of analysis done will limit the manner of statistical treatment
that will be permissible and effective.


IMPORTANCE OF TREATING SYMBOLIC 
PRODUCTS SCIENTIFICALLY 

Social psychologists are concerned with the analysis of qualitative
material for two primary reasons. The proper subject matter
of social psychology consists, in large measure, of verbal and other
symbolic behavior as it is found in society. Methods must be devised
to treat this behavior analytically. But social psychologists do not
confine themselves simply to recording and describing symbolic
behavior as it is found in real life; they also construct situations
designed to elicit symbolic behavior under more controlled conditions.
In a certain sense, they create symbolic materials so that they
may analyze them in keeping with the objectives laid down in the
design of these contrived situations.

Qualitative Material as Natural Phenomena

When one stops to think of it, it is really surprising how much
of the subject matter of social psychology is in the form of verbal
behavior. The formation and transmission of group standards,
values, attitudes, and skills are accomplished largely by means of
verbal communication. Education in the schools, in the home, in
business, in the neighborhood, and through the mass media is
brought about by the transmission of information and by the
exercise of controls which are largely mediated through written
or spoken words. If one is concerned with problems of social organization,
the situation is similar. Supervision, management, coordination,
and the exertion of influence are principally matters of verbal
interaction. Social and political conflicts, although often stemming
from divergent economic interests and power, cannot be fully
understood without studying the words employed in the interaction
of conflicting groups, and the process of mediation consists largely of
talking things out. The work of the world, and its entertainment too,
is in no small measure mediated by verbal and other symbolic
behavior.

The systematic description of these phenomena by social scientists
involves the recording of these symbolic products in an orderly
fashion, classifying or categorizing them, and determining their
quantitative incidence and interrelations. If these procedures are
carried out in a proper way, objective and general statements may
be made about them.


Qualitative Material Created by Social-Psychological Research

Many of the techniques of research developed by social psychologists
have as their end product verbal or other symbolic
material. The research interview is a principal example of this
technique. Here the researcher, by asking questions, stimulates verbal
behavior which he hopes will provide indicators of certain
characteristics of the individual or of his relationships with others.
Such variants of the interview as projective tests, stimulated themes,
life histories, and the like are of a similar nature. Experiments
in the laboratory and in the field also produce materials that must
be submitted to systematic analysis.

In research where the symbolic material is specifically stimulated,
this material is usually taken to be indicative of something
beyond itself. A particular statement, for example, given by a respondent
in an interview has significance to the researcher because
it may be taken to indicate the presence of a certain attitude, value,
cognitive structure, or the like. The qualitative analysis of such
statements, therefore, must proceed in a way that will make it possible
to describe clearly to other scientists how the conversion was
made from a particular range of qualitative phenomena to a specific
genotype or hypothetical construct.


Categorizing Qualitative Materials 

When the social psychologist has obtained a set of qualitative
materials, either from records of natural social phenomena or as
products stimulated by a research project, he will want to classify the
content into appropriate categories so that he can describe it in an
orderly way. This process of classification into categories is commonly
known as “content analysis” or “coding.” The former term
is more frequently used in reference to qualitative materials recorded
from nature; the latter is more commonly employed in the
analysis of materials created by the research. “Coding” is used especially
to refer to the process whereby answers to interviews are
categorized. However, no universally accepted usage has emerged
to distinguish one term from the other.

In an excellent discussion of the field of content analysis as it
has developed in communication research, Berelson proposes the
following definition: “Content analysis is a research technique for
the objective, systematic, and quantitative description of the manifest
content of communication” (6, p. 18). This is a satisfactory definition
if it is interpreted liberally. Communication should be thought of
as any linguistic expression, and the restriction to “manifest” content
should be removed. With these modifications, we have an adequate
designation of all the kinds of analysis of qualitative materials
of interest to social psychologists. In the subsequent discussion, we
propose to use the terms “content analysis” and “coding” interchangeably
to refer to the objective, systematic, and quantitative
description of any symbolic behavior.
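
As a rough illustration of what an objective, systematic, and
quantitative description can mean in practice, the sketch below assigns
statements to categories by a fixed rule and then counts the results.
It rests on assumptions not found in the text: the codebook, its cue
words, and the function names are hypothetical, and a cue-word match
merely stands in for the judgment of a trained coder.

```python
import re
from collections import Counter

# Hypothetical codebook: each category is defined by a set of cue words.
CODEBOOK = {
    "economic": {"price", "prices", "wages", "job"},
    "political": {"vote", "party", "election"},
}


def code_statement(text, codebook=CODEBOOK):
    """Return every category whose cue words occur in the statement."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    matched = sorted(cat for cat, cues in codebook.items() if words & cues)
    # Statements matching no category are kept, not silently dropped.
    return matched or ["uncoded"]


def tabulate(statements, codebook=CODEBOOK):
    """Count how often each category is assigned across all statements."""
    counts = Counter()
    for statement in statements:
        counts.update(code_statement(statement, codebook))
    return counts


# Invented answers, for illustration only.
statements = [
    "Prices are too high.",
    "I will vote for the party that protects wages.",
    "The weather was fine.",
]
print(dict(tabulate(statements)))
```

Because the rule is explicit, a second analyst applying the same
codebook to the same statements must arrive at the same counts, which
is the sense of “objective” intended here.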


Uses Made of Content Analysis

The most detailed summary of the many uses of content analysis 
is that of Berelson (6), who has developed a system of classification 
resulting in a listing of sixteen uses of content analysis of verbal 
material. Although there are several alternative ways in which the 
work in the field could be classified, Berelson's listing is quite satis- 
factory. We reproduce it here, with some of the studies cited by him, 
in the interest of standardizing terminology. For a comprehensive 
bibliography of publications dealing with content analysis, the
reader is encouraged to refer to Berelson's book. 

Three broad approaches to the analysis of symbolic materials
are designated by Berelson. In the first, the researcher is interested 
primarily in the characteristics of the content itself. In the second, 
he tries to make valid inferences from the nature of the content 
to characteristics of the producers of the content or of its causes. 
In the third, he interprets the content so as to reveal something 
about the nature of its audience or of its effects. Any single study 
may or may not adopt more than one of these approaches. 

CHARACTERISTICS OF CONTENT. Interest in the first approach 
will lead one to focus either on the substantive nature or upon the
form of the content. Berelson lists six uses which are concerned 
primarily with the substantive characteristics of the symbolic 
materials. In the first two of these, comparisons are made among ma- 
terials produced at different points in time. In the next two, 
materials coming from different sources are compared. In the fifth 
use, the observed substance of communication content is evaluated 
against standards adopted by the investigator. And under the sixth 
heading, Berelson simply points out that the substantive character- 
istics of symbolic behavior are often analyzed by researchers investi- 
gating reactions under controlled conditions. 

To describe trends in communication content. Many investi- 
gations have been undertaken to determine changes in content over 
periods of time. If one is to establish the nature of such trends in 
communication content, it is necessary to employ comparable meth- 
ods for sampling the total flow of communication at successive points 
in time and to use the same system of classification throughout. A 
fairly typical example of a trend study is the analysis by Yakobson 
and Lasswell (52) of May Day slogans in the Soviet Union. They 
found, for example, that May Day slogans changed over a period 
of years from employing “universal revolutionary” symbols to
“national” ones. Ojemann (40) studied a considerably different
type of content. He recorded trends in articles on child development
appearing in the Ladies’ Home Journal and Good Housekeeping
and was able to show that articles of this sort were, in the early 
part of the century, much less frequently based upon “scientific 
authority” than they were by 1940. Still another kind of trend study 
is that in which public opinion is measured by sample surveys. Here, 
instead of relying upon the recording of natural phenomena to 
reveal trends, the social scientist repeatedly applies the same ques- 
tions to comparable (sometimes identical) samples of the population 
in order to detect changes of opinion. Cantril’s study (11) of Amer- 
ican attitudes toward international affairs just before and after Pearl 
Harbor, and Cartwright’s research (IS) on attitudes toward the government’s
inflation-control program throughout World War II are
examples of this use. 
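
The requirement that the same system of classification be used at each
point in time can be sketched as a simple tabulation. The example below
is an illustration only: the function name, the period labels, and the
coded records are invented and do not reproduce the results of any study
cited here. It reports, for each period, the share of coded items
falling in each category, so that periods with different numbers of
items remain comparable.

```python
from collections import Counter, defaultdict


def category_shares_by_period(coded_items):
    """coded_items: iterable of (period, category) pairs.

    Returns {period: {category: share of that period's items}}.
    """
    counts = defaultdict(Counter)
    for period, category in coded_items:
        counts[period][category] += 1
    shares = {}
    for period, counter in counts.items():
        total = sum(counter.values())
        shares[period] = {cat: n / total for cat, n in counter.items()}
    return shares


# Invented records: each slogan coded once, with the same two categories
# applied in both periods.
records = [
    ("early", "universal"), ("early", "universal"), ("early", "national"),
    ("late", "national"), ("late", "national"),
    ("late", "universal"), ("late", "national"),
]

shares = category_shares_by_period(records)
print(shares["late"]["national"])  # 0.75
```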

To trace the development of scholarship. This use of content
analysis is essentially the same as the one above. It is mentioned
separately because a substantial amount of research has been done
specifically to detect trends in the publications of scholarly and
scientific journals. A good illustration of this kind is Allport and
Bruner’s study (9) of the topics of research in psychology over a
period of fifty years.

To disclose international differences in communication content.
With increasing interest in problems of international relations,
social scientists are coming more frequently to study the systematic
differences that exist among countries in the content of their
major media of communication. Two studies comparing Germany
and the United States may be cited. Herbert Lewin (34) made a comparable
categorization of the literature of the Hitler Youth and the Boy
Scouts of America in terms of their goals and justifications. McGranahan
and Wayne (35) compared the major themes of the most popular
dramas appearing in Germany and America in the years 1927 and
1910. In both these studies substantial and rather similar differences
were found between the two countries. Other studies comparing different
countries have been made in terms of such media as radio,
newspapers, and textbooks, and a few comparable interviewing surveys
have also been conducted in different countries. There are
obviously many difficult problems of sampling and of translation
in making such cross-national comparisons, but this kind of research
has produced some of the most useful data now available for an
understanding of national differences.

To compare media or “levels” of communication. Students
interested in understanding the role of the mass media in molding
public opinion have made especial use of this type of content analysis.
Lazarsfeld, Berelson, and Gaudet (30), for example, studied
differences in partisanship among newspapers, magazines, and radio
during the 1940 presidential campaign. They found all three media
favoring the Republican side, with the magazines more strongly
partisan than the other two. Millspaugh (38) analyzed the role of
different Baltimore newspapers in the city’s interracial relations by
studying the treatment given a Negro accused of murder, before
his trial, in the different papers. He found sharp differences between
the white and “Negro” papers in the proportion of statements
carried which were “helpful,” “destructive,” or “neutral” to the
defendant’s case. Treatment of many other controversial subjects
by the different media has also been compared.

To construct and apply communication standards. Many
studies have been aimed at evaluating the social contributions of
the media of communication. Such an evaluation can be made
only by comparing actual performance against some sort of standard.
Criticisms of the media for being biased, or for transmitting frivolous
or trashy content, or for oversimplifying issues, etc., call upon, at
least implicitly, certain standards of what the media should be doing.
If such standards can be made explicit and precise, it is possible for
a content analysis of the communications actually transmitted to provide
objective evaluation of the media. Even if a technically competent
job of content analysis is done, its acceptability as an evaluation
will still depend upon the acceptability of the standards employed.
Much of the research in this field has been based upon standards
proposed by investigators who appeal to such widely accepted
cultural standards as “fairness,” “objectivity,” or “balance.” Thus
Sussman, by assuming that radio has an obligation to give a fair
and balanced presentation of every major social group, was able
to document charges of bias through a content analysis of about
thirty news programs on the major networks during a presidential
campaign. She found that “labor was presented as being morally
wrong five times as often as it was morally right; on the other hand,
it was presented as being strong just as often as it was presented
as being weak” (45, p. 210). Another example of evaluating performance
against standards is provided by the British Royal Commission
on the Press (41). This group set up a list of major facts
concerning the first year’s progress of the National Coal Board.
They then checked the reporting of these “actual events” in various
newspapers and revealed very meager coverage of what they regarded
as socially important information.

To aid in technical research operations. Under this heading,
Berelson has grouped two of the most common uses of content
analysis in contemporary social research: the coding of free-answer
interviews and the analysis of interaction among people in groups.
A thorough discussion of the many uses of interview surveys is found
in Chapter 1. The problems involved in the latter use are discussed
in detail in Chapter 9.

The next three uses of content analysis have in common a
focus upon the form of the content (in contrast to its substantive
nature). The first of these is concerned with the analysis of propaganda.
The second derives from a practical interest in improving
the intelligibility of written communications. The third has been
found most commonly among those interested in the study of literature.


To expose propaganda techniques. Content analysis of propaganda
has quite often been designed to reveal the ways in which the
propagandist pursues his objectives of influencing the public. Berelson
points out that two broad classes of techniques have been
studied: the themes or appeals employed, and the “tricks of the
trade.” Examples of the former are a study of British and German
propaganda in World War I by Lasswell (26) and an analysis by
White (48) of the values employed by Hitler and Roosevelt in their
speeches just before World War II. Lasswell concludes, for example,
that the British stressed humanitarian ideals much more than did
the Germans, and White found, among other things, that 35 percent
of Hitler’s emphasis units invoked the value of “strength,” in contrast
to only 15 percent in Roosevelt’s speeches. A study by Lee and
Lee (31) illustrates research on the second type of technique. Here
seven “tricks of the trade” were enumerated in the speeches of Father
Coughlin. Other studies have investigated more particularly the
utilization of emotional content. Waples and Berelson (47), for
example, constructed an index of the incidence of emotional terms
in various media during the 1940 presidential campaign and found
significantly more emotional content in material dealing with
Roosevelt than in that dealing with Willkie.

To measure the readability of communication materials.
Interest in the ability to grade materials on the basis of their
difficulty of comprehension was displayed by educators wanting to
fit materials to the mental level of different groups.
Several schemes have been developed for this purpose.
The one currently enjoying the greatest popularity is that of Flesch,
who distinguishes two major components of readability:
reading ease, which is measured by the number of syllables per
hundred words and by the average length of sentence, and
human interest, which is measured by the percentage of personal
words and of personal sentences. Interesting problems of validation
of this kind of index have been noted, one critic showing that one such
formula makes Kurt Koffka easier reading than William
James.
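
The two components just described can be written as simple linear
scores. The sketch below is offered with some caution: the chapter gives
no constants, and those used here are the ones usually quoted for
Flesch's 1948 formulas, so they should be checked against the original
before being relied upon. The function and parameter names are
hypothetical, and the counts (syllables, personal words, and so on) are
assumed to have been made by hand beforehand.

```python
def reading_ease(syllables_per_100_words, words_per_sentence):
    """Reading-ease score; higher values indicate easier material.

    Constants as usually quoted for Flesch's 1948 formula (an assumption
    here, not taken from this chapter).
    """
    return (206.835
            - 0.846 * syllables_per_100_words
            - 1.015 * words_per_sentence)


def human_interest(pct_personal_words, pct_personal_sentences):
    """Human-interest score; higher values indicate more personal material.

    Constants as usually quoted for Flesch's 1948 formula (an assumption
    here, not taken from this chapter).
    """
    return 3.635 * pct_personal_words + 0.314 * pct_personal_sentences


# Hypothetical passage: 150 syllables per 100 words, 20 words per sentence.
print(round(reading_ease(150, 20), 1))  # 59.6
```

Both scores depend only on surface counts, which is why validation is a
problem: a text can be short-worded and short-sentenced and still hard
to understand.
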
To discover stylistic features. The stylistic characteristics of
literary products have been studied quite extensively through con- 
tent analysis. Typical examples are the examination by Miles (37) 
of the ratio of verbs to substantives in poetry since the sixteenth 
century, the study conducted by Skinner (43) of alliteration in 
Shakespeare’s sonnets, and investigations to resolve questions of 
disputed authorship and of the correct chronology of a given author’s
works (21, 53). 

PRODUCERS OR CAUSES OF CONTENT. The second major approach
to content analysis consists of the attempt to learn something about 
the nature of the producer or, more generally, the causes of the 
symbolic material from the characteristics of the material itself. 
In some situations, where the researcher has access only to the com- 
municated material and cannot study the communicator directly, 
this method is used as a matter of expediency. In other situations, 
where a person can be induced to produce symbolic behavior as a 
response to standard conditions, the characteristics of such behavior 
are often taken as a very acceptable indication of the person’s own 
characteristics. The following four uses of content analysis illustrate 
various ways in which social scientists have tried to construct a 
picture of the communicator from his symbolic products. 

To identify the intentions and other characteristics of the com- 
municators. In a number of studies the intentions and attitudes of 
communicators have been inferred from an analysis of the content 
of their communications. Unfortunately, most of these studies have 
not had tests of validity that allow a very good assessment of how 
successful the inferences as to intentions have been. An illustration 
of this type of content analysis is the study of Britt and Lowry (8) 
in which the treatment of A.F.L.-C.I.O. conflict in labor newspapers 
was analyzed to reveal how closely local leadership was following 
the official position of the national leadership. This analysis revealed 
a predominant position of neutrality by the local labor press, which 
was taken by the authors to indicate a considerable inertia on the 
part of local leaders with respect to the philosophy of the national
organization. Another study, in which validation is virtually impossible,
is that of Leites et al. (32) of speeches given in 1949 by members
of the Soviet elite in celebration of Stalin’s birthday. These
speeches were analyzed so as to disclose attitudes of the speaker
toward Stalin. Sharp differences were found in the image of Stalin
revealed by the old Bolsheviks and by the other speakers.

To determine the psychological state of persons and groups.
This use of content analysis has perhaps been of most value to people
interested in the study of personality. Clinical interviews, projective
tests, life histories, diaries, letters, and other personal documents
have been analyzed for this purpose. Allport (1) has summarized
well the various techniques and objectives of this type of
research. In one study, Baldwin (3) recorded the frequencies with
which certain themes were contiguous to one another within a sample
of letters written by a single person. The clustering of themes
found in the letters was taken to indicate principal motivational
and ideational clusters in the writer’s personality. Another type of
content analysis which is designed to reveal something of the emotional
adjustment of a client during treatment is the discomfort-relief
quotient developed by Dollard and Mowrer (17). This quotient
is computed by dividing the total number of discomfort words by
the number of discomfort-plus-relief words combined. Suggestive
conclusions have been drawn from noting changes in the value
of this quotient during the course of treatment and from comparing
the value of the quotient for one client and another. Quite a different
type of study is that of the United States Strategic Bombing
Survey (46), in which captured German civilian mail was analyzed
to determine the effects of strategic bombing on civilian morale. The
letters were coded in various ways to indicate evidence of low morale,
poor health, and anxiety. These indications of demoralization were
then statistically related to other characteristics of the writers, such
as sex, date of writing, tonnage of bombs dropped on the locality where
the letter was written, and the like. From the relationships thus
established, certain conclusions about the effects of bombing on
morale were drawn. Thus, for example, day raids were found to be
more demoralizing than night raids, and certain community disruptions
were found to be more demoralizing than others. In yet another
example, content analysis was used to reveal basic personality
orientations toward leadership in a study by Sanford and Rosenstock
(42). These authors developed a projective device in the form of
cartoon pictures which can be administered in brief interviews
on the doorstep. They find that responses can be reliably coded and
that scores obtained in this fashion correlate well with an authoritarian-equalitarian
scale.
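
The discomfort-relief quotient described above reduces to a single
ratio, which the following sketch makes explicit. It is a simplified
illustration under stated assumptions: the word lists and the function
name are hypothetical, and counting matches against fixed lists stands
in for whatever judgments of discomfort and relief the original scoring
procedure requires.

```python
# Hypothetical word lists; a real study would define these by rule.
DISCOMFORT = {"tense", "afraid", "worried"}
RELIEF = {"calm", "relieved"}


def discomfort_relief_quotient(words, discomfort_words, relief_words):
    """Discomfort words divided by discomfort plus relief words.

    Returns None when the text contains neither kind of word, since the
    quotient is undefined in that case.
    """
    discomfort = sum(w in discomfort_words for w in words)
    relief = sum(w in relief_words for w in words)
    total = discomfort + relief
    if total == 0:
        return None
    return discomfort / total


# Invented fragment of a client's statement.
words = "i felt tense and afraid but later calm".split()
print(discomfort_relief_quotient(words, DISCOMFORT, RELIEF))
```

Here two discomfort words and one relief word give a quotient of
two-thirds. The quotient runs from 0 (relief words only) to 1
(discomfort words only), so a falling value over successive interviews
is what would be read as improvement.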

To detect the existence of propaganda (primarily for legal pur- 
poses). During World War II the United States Department of 
Justice, making use of techniques of content analysis developed by 
Lasswell (27), introduced into the sedition trials evidence that there 
were remarkable parallels between allegedly native-fascist propa- 
ganda and the propaganda of the Nazis. In the scheme of analysis 
developed for this purpose, Lasswell subjects the material in ques- 
tion to a number of tests. For example: How parallel is the content 
of a given channel with that of a known propaganda channel? Is 
the vocabulary employed the same in certain distinctive features as 
that used by a known propaganda channel? Does the material persistently
distort statements on a common topic in a direction favorable
to one side of a controversy?

To obtain political and military intelligence. In times of international
crisis, when military or political conditions throw an iron
curtain around nations, the needs for intelligence about hostile nations
assume practical urgency. During World War II and since,
methods have been developed whereby it is hoped that national
military and political intentions can be anticipated from analyzing
the characteristics of nationally controlled communication materials.
The Foreign Broadcast Intelligence Service was engaged
in such activities during the last war. Although it is extremely difficult
to make at all satisfactory checks of validity on these methods,
George (20) concludes from a careful study of the performance of
these researchers that they made successful predictions about twice
as often as unsuccessful ones. Further evaluation of this method cannot
be made at present because much of the current research in this
area is not available to social scientists generally for reasons of
military security.

AUDIENCE OR EFFECTS OF CONTENT. In the third major approach
to content analysis, the material is taken as a basis for inference
about characteristics of the audience for whom the content is
intended, or about the effects of communication. It should be apparent
that an inference from the nature of the content to the nature of
its audience is possible only if certain assumptions are made (for
example, that the communication correctly reflects audience
interest) about the situation in which the content is produced. Often
these assumptions have little empirical justification, and the
inferences about the nature of the audience are worthy of consideration
only because direct observation of the audience is not possible.
The following three uses of content analysis illustrate this third 
approach. 

To reflect attitudes, interest, and values (“cultural patterns”)
of population groups. In several studies of the content of mass
media, it has been assumed that the material communicated through
these media expresses or reflects the prevailing thought and mores
of the population at that time. Thus, Hart (22) analyzed the content
of popular magazines in the United States over a period of years
from 1900 to 1930. He found what he took to be evidence for a
decline in the status of religion and an increase in toleration of sexual
freedom during this period. The primary bases for these conclusions
were changes in the amount of attention given to the topic in the
magazines and the incidence of indicators of approving attitudes.
Wolfenstein and Leites (50) have drawn certain conclusions about
the American culture of today from an analysis of contemporary
feature films. In this study it is explicitly assumed that a movie is
[...] which reveals something of the emotional
life of the moviegoing public. One of the more ingenious
aspects of this study is the interpretation of American relations
between the sexes based upon the prevalence of the “good-bad girl”
in American movies, a type not so common in films of other countries.

In all studies of this type, it is difficult to determine why the
authors assume that the content reflects characteristics of the
audience rather than of the producers. Often the assumption seems
to be that it reflects both equally well, or that it
reflects the audience because the producers are for some reason
(perhaps by a sort of “natural selection”) attuned to the audience.

To reveal the focus of attention. A slightly different assumption
is made about the audience in this use of content analysis: that
there is a more or less approximate correspondence between the
content of the mass media and the [...] audiences exposed to the
media (presumably [...] content). If some item of [...] is
stressed in the media at some place and time,
it is assumed that this will be salient in the thinking of the
population. Research on reading and listening behavior indicates
that this assumption can, at most, be only an approximation and 
that various people exposed to the same content through the same 
media may react to it in quite different ways. Some examples of this 
approach may be cited. Woodward (51) showed that the percentage 
of foreign news in American morning newspapers in 1927 rarely 
exceeded 10 percent, indicating, presumably, that American people 
were only slightly aware of foreign events. Festinger, Cartwright,
et al. (18), in studying the circumstances under which an
anti-communist rumor broke out in a local community, showed that in
the period preceding the outbreak of the rumor there was a marked 
increase in the number of column inches devoted to the theme of 
domestic communism in the newspapers read by the local citizens. 
Although these newspapers did not deal at all with the specific 
subject of the local rumor, the authors argue that the media set, 
or reflected, an atmosphere favorable to the occurrence of the 
rumor. Arnheim (2) conducted a now classical study of what kind
of world is brought to the attention of listeners to soap operas. He 
showed, for example, that the world of the daytime radio serial 
deals predominantly with themes concerning personal problems 
rather than public affairs. Of 43 such serials, for example, 49 percent 
dealt with problems of courtship, whereas only 26 percent were 
concerned with public affairs. 

To describe attitudinal and behavioral responses to
communications. Berelson points out that there have been three ways in which
content analysis has been used to study the effects of
communications. The first consists in analyzing materials which were produced
in response to some specific communication. Lerner's study (33)
of published reactions to The American Soldier is a good
illustration of this type of analysis. The second kind of investigation
attempts to show empirical relations between the content of a
communication and responses to it. Thus, Berelson (5) showed that the
greater the frequency of certain political arguments in the various
media, the larger was the number of people who could recognize
the argument. On the other hand, he found a much weaker
relationship between the frequency of appearance of the argument and
acceptance of it. Merton (36) also attempted to relate characteristics
of the content of media to people's reactions to it in his study of
Kate Smith’s war bond marathon during World War II. He noted,
among other things, that fully half of Smith's material stressed the
sacrifice theme, and in intensive interviews with people who heard
Smith he traced through the personal meaning of this appeal to
listeners. In the third kind of analysis, direct inference is made to the
effect of content without any reference to response data themselves.
In this way Lasswell and Blumenstock (28) analyzed the themes
employed by communists in Chicago in the 1930's and concluded
that their propaganda was relatively ineffective because it ran
counter to the fundamental values and mores of the citizens to whom
it was addressed.

It should be apparent from this summary that content analysis
has received serious attention in widely different fields of
investigation. Any evaluation of the theoretical or practical significance of
this technique must be made, as in the case of any technique, in
terms of the specific objectives set up for the research. The work
completed to date undoubtedly shows that this technique can be
successfully applied to the solution of many significant problems.
Only future work can demonstrate its full potential and limitations.
It is possible, however, from the experience accumulated to date,
to set up certain standards which should be met in the process of
analyzing symbolic materials.

Let us turn now to a more detailed examination of the process
by which qualitative materials are converted into scientifically
acceptable data and examine some of the principles that should
govern this operation.


CONVERTING PHENOMENA INTO
SCIENTIFIC DATA

The recording of symbolic materials as found in life settings
or the stimulation of them in contrived situations provides the
analyst only with raw materials. Inspection of such materials may
lead a sensitive person to certain insights and conclusions, and these
may be in a certain sense correct. But in the long run both
scientific and practical progress require more than sensitive insight
(though both can certainly make good use of it). To the extent
that investigators cannot communicate to others how their insights
are accomplished, the ability to achieve them is retained as private
property of individuals. These conditions would produce, at best,
experts and not a body of knowledge.

The objective of content analysis is to convert recorded raw
phenomena into data which can be treated in essentially a scientific
manner so that a body of knowledge may be built up. More
specifically, content analysis must be conducted so as (1) to create
reproducible or objective data which (2) are susceptible to
measurement and quantitative treatment, (3) have significance for some
systematic theory, and (4) may be generalized beyond the specific
set of material analyzed.


The Problem of Objectivity

Suppose that an investigator has collected such materials as
speeches made by a political candidate, articles appearing in Pravda
or the New York Times, deliberations of the United Nations
Security Council, or answers given by respondents in a sample
interview study. How should he go about constructing his descriptive
statements concerning these materials so that other analysts can
verify them independently? Four aspects of this problem of
objectivity may be noted.

THE VARIABLES TO BE EMPLOYED IN THE ANALYSIS OUTLINE. Unless
there is agreement among investigators about the aspects of
the material that are to be described, there can hardly be agreement
in the resulting descriptions. To note how many different
attributes can be found in the same material, consider the brief quotation
from an interview conducted with an industrial worker during
World War II:

I'd like to see all trade barriers down after the war. Raw
goods should be shared where they're needed. It's money and
raw goods and poor living that caused most of this war. We should
see that Germany gets it fair this time or we'll have another war.
Russia is fighting for her way of life just like we are for ours.
England is fighting along with us and Russia to protect the people
against fascism, to be free, not slaves. Churchill and Roosevelt
and Stalin are great men. They know how the people feel. We
can't stay on our side of the pond anymore. The union's taught
me that.

Literally scores of attributes can be found in this brief passage.
Let us list a few: (1) number of words, (2) percentage of personal
pronouns, (3) attitude toward free trade, (4) perceived cause of war,
(5) degree of confidence in the Allies, (6) degree of confidence in
leaders, (7) attractive traits of leader, (8) attitude toward
isolationism, (9) evidence of previous isolationism, (10) source of influence
on attitudes, (11) implied values, (12) inclusiveness of cognitive
structure, and (13) degree of approval of Allied war aims.

It should be obvious that many other attributes of this material 
could be listed and that disagreements about the true nature of the 
material could easily arise. Objectivity requires, therefore, explicit 
specification of the variables (sometimes referred to as "dimensions"
or "types of attributes") in terms of which descriptions are to be
made. This is the first step in constructing the analysis outline (or 
"code"). We shall return later to the question of how these variables 
should be selected. 
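
The first two attributes in the list above are purely mechanical and can be computed without judgment, which makes them a convenient illustration of what an explicitly specified variable looks like. The following is a minimal sketch only: the function names, the tokenizing rule, and the pronoun list are assumptions introduced here for illustration, not part of any procedure described in the text.

```python
import re

# Illustrative assumption: a short, hand-picked list of personal pronouns.
PERSONAL_PRONOUNS = {
    "i", "me", "my", "we", "us", "our", "ours", "you", "your",
    "he", "him", "his", "she", "her", "they", "them", "their",
}

def tokenize(text):
    """Split text into lowercase word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())

def surface_attributes(text):
    """Return attribute (1) number of words and (2) percentage of personal pronouns."""
    words = tokenize(text)
    pronouns = [w for w in words if w in PERSONAL_PRONOUNS]
    pct = 100.0 * len(pronouns) / len(words) if words else 0.0
    return {"number_of_words": len(words), "pct_personal_pronouns": pct}

# Two sentences adapted from the end of the interview excerpt.
sample = "We can't stay on our side of the pond anymore. The union's taught me that."
attrs = surface_attributes(sample)
print(attrs)
```

Even here the result depends on stated choices (what counts as a word, which pronouns are listed), which is the point of requiring that variables be specified explicitly. Attributes such as "attitude toward free trade" cannot be computed this way and need the categories and definitions discussed next.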


THE CATEGORIES FOR EACH VARIABLE. Let us assume now that
we have chosen some variable, perhaps one from the list above:
"confidence in the Allies." There remain many ways in which this
variable may be broken down into categories. We might decide
to code each interview (containing appropriate material) so that each
falls into one of the three following categories:
(1) confidence, (2) mistrust, (3) not classifiable in either. It should,
however, be equally possible to use seven categories: (1) Unqualified
confidence, (2) Qualified confidence, (3) Confidence and mistrust
equally balanced, (4) Qualified mistrust, (5) Unqualified mistrust,
(6) Question asked, but no answer, (7) Question asked, but answer
not classifiable in the above categories. It should be apparent that
if two analysts coded the same material, one
using the first system of categories and the other using the second, they
would emerge with different descriptions of the same phenomena.
Since many other systems of categorization are possible
with each variable, specification of the system of categories used
is essential for reproducible analysis.

THE OPERATIONAL DEFINITION FOR EACH CATEGORY. Two independent
coders might agree to analyze the interviews for "confidence
in the Allies" and to employ the threefold system of
categorization, and yet they might not agree at all in their actual
codings. To be sure that they will agree, they need explicit rules
specifying what features of the content are to be taken as indication
that it falls in one category rather than another. A statement of these 
rules constitutes the operational definition of the category. 
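
Whether two coders "agree in their actual codings" can be checked directly once both have coded the same recording units. The sketch below computes simple percent agreement and lists the disagreeing pairs. It is an illustration under stated assumptions: the function names and the three category labels are hypothetical, and percent agreement is used only because it is the simplest index, not because the text prescribes it.

```python
from collections import Counter

def percent_agreement(codes_a, codes_b):
    """Percentage of recording units given the same category by both coders."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same recording units")
    if not codes_a:
        raise ValueError("no recording units to compare")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)

def disagreements(codes_a, codes_b):
    """Count each (coder A category, coder B category) pair where they differ."""
    return Counter((a, b) for a, b in zip(codes_a, codes_b) if a != b)

# Hypothetical codings of five interviews under a threefold scheme.
coder_a = ["confidence", "mistrust", "confidence", "unclassifiable", "confidence"]
coder_b = ["confidence", "mistrust", "mistrust", "unclassifiable", "confidence"]

agreement = percent_agreement(coder_a, coder_b)
confusions = disagreements(coder_a, coder_b)
print(agreement, dict(confusions))
```

The table of disagreeing pairs is often more useful than the single percentage, since it shows which category boundaries the operational definitions have left unclear.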

In drawing up such an operational definition, it is important 
to begin by designating the units of analysis that are to be used. 
There are basically two kinds of units to be specified. The first of 
these may be called the "recording unit," which is the specific
segment of the content that is characterized by placing it in a given
category. The second kind of unit is the "context unit," which is
"the largest body of content that may be examined in characterizing 
a recording unit" (6, p. 135). The coder might, for example, count 
each emotionally loaded word as a recording unit, but refer to an 
entire paragraph to be sure that he records its correct meaning. In 
the coding of free-answer interviews, the answer to a single question
is often taken as the recording unit and a whole block of related
questions is employed as the context unit. This procedure is
followed because the correct meaning of an answer to a single question
may sometimes be appreciated only by reference to what has gone
before or what comes after.

A second aspect of the operational definition of a category con- 
sists in specifying the indicators which determine whether any given 
unit should fall within the category. In the example given above, 
we might have considered the category "high confidence in the 
Allies" and taken as an indicator the statement "England is fighting 
along with us and Russia to protect people against fascism." If we 
were coding a very large number of interviews, we would encounter 
many other statements which should also be taken to indicate the 
same "high confidence in the Allies." Thus, a category consists of a
range of possible indicators, all of which are given the same label
and are therefore handled equivalently in all subsequent treatment 
of the data. If it were possible to list all the variations of content 
which indicate a given category, such a list would provide a
complete operational definition of the category. Unfortunately, most
categories with which social scientists deal cannot be defined in
actual practice by an exhaustive listing of indicators.

Instead of attempting to construct a complete list, the analyst
will find it more effective to rely upon the ability of a trained person
to respond to indicators in a systematic way. To respond
systematically, the coder needs a rationale for a given set of equivalent
indicators. Often this can be conveyed by establishing the "core
meaning" or "ideal type" of a given category and then defining its
boundaries through examples of indicators which will be taken to
fall on each side of the boundary.¹
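
The elements of an operational definition named so far (recording unit, context unit, core meaning, and boundary examples) can be written down as a small data structure, so that every coder works from the same explicit record. This is a sketch only: the class and field names are hypothetical, the core-meaning wording is assumed, and the "outside" example is invented for illustration. The "inside" indicator is the one quoted in the text.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Category:
    label: str
    core_meaning: str                       # the "ideal type" of the category
    indicators_inside: List[str] = field(default_factory=list)
    indicators_outside: List[str] = field(default_factory=list)

@dataclass
class Variable:
    name: str
    recording_unit: str                     # segment that receives a code
    context_unit: str                       # largest body consulted when coding
    categories: List[Category]

    def labels(self) -> List[str]:
        return [c.label for c in self.categories]

    def check(self, code: str) -> str:
        """Reject any code that is not one of the variable's categories."""
        if code not in self.labels():
            raise ValueError(f"{code!r} is not a category of {self.name!r}")
        return code

confidence = Variable(
    name="confidence in the Allies",
    recording_unit="answer to a single question",
    context_unit="block of related questions",
    categories=[
        Category(
            label="high confidence",
            core_meaning="Respondent expresses trust in the Allies' aims.",  # assumed wording
            indicators_inside=[
                "England is fighting along with us and Russia to protect "
                "people against fascism.",
            ],
            indicators_outside=[
                "I'm not sure what England is really after.",  # invented example
            ],
        ),
        Category(
            label="not high confidence",
            core_meaning="Any other codable statement about the Allies.",  # assumed wording
        ),
    ],
)
print(confidence.labels())
```

The structure does not do the coding. It only records the rationale and boundary examples a trained coder is expected to apply.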

ADAPTATION OF ANALYSIS OUTLINE TO THE EMPIRICAL CONTENT. 

The most logically constructed and theoretically elegant scheme 
of analysis will not produce objective results if it does not in fact
"fit" the material being analyzed. Try, for example, to fit the
interview quoted above into a system of classification designed to reveal
the respondent’s stage of psychosexual development. Reproducible
coding will be possible only when the system of classification is
properly adapted to the material being coded.

Lazarsfeld and Barton (29) have suggested that in constructing 
the analysis outline for use with free-answer interviews, there are 
two adaptations to the empirical material which should always be 
made. The first of these they call "adaptation to the structure of the
situation." Thus, in the analysis of "reasons" given for a certain
behavior, it is desirable to “build up a concrete picture or model
of the whole situation to which the reports refer, and then locate
the particular report within this ‘structural scheme’ ” (p. 159). In
constructing a scheme for analyzing reasons women give for buying 
a specific brand of cosmetics, one would set up variables referring 
to such things as sources of information, sources of advice, motives 
related to use of cosmetics, technical qualities of cosmetics, anxieties 
about consequences on health, considerations of expense, etc. An 
excellent example of this approach is the study of men’s preferences 
in suits, coats, and jackets, conducted by the Division of Special 
Surveys of the United States Department of Agriculture (10). In 
designing the interview and the scheme of analysis, these investiga- 
tors obtained detailed data bearing on three general areas: (1) cir- 
cumstances that led to the decision to buy the most recently pur- 
chased suit, (2) what men wanted their suits to do for them, and (3) 
the values men brought to bear upon their suit shopping and the 
means by which they thought they could attain them. Several more 
specific variables were set up in each of these areas. 

1 Cartwright and Festinger (12, 15) have demonstrated that people do
employ categories of judgment whose boundaries may vary in precision, and
that difficulties of categorization, as reflected in decision time, increase as the
material being classified moves from the "core" of the category toward the
boundary.

The second type of adaptation to the empirical material is
“adaptation to the respondent's frame of reference" (p. 162). The 
need for such an adaptation becomes vividly apparent whenever 
one attempts to apply a classification scheme assuming greater 
sophistication or differentiation than in fact exists in the respond- 
ent’s thinking. Cartwright (13) found, for example, in studying 
popular conceptions of wartime inflation control, that characteriza- 
tions employed by technical economists could not be used in coding 
popular descriptions of war finance. 

This requirement that we adapt the analysis outline to the 
respondent’s frame of reference has one important consequence 
often overlooked. Consider its implications for two different tech- 
niques of interviewing— the free-answer question and the fixed- 
alternative question. From the free answer, the researcher obtains 
verbal material which he must analyze according to his scheme of 
analysis. When, however, a fixed-alternative question is asked, this 
scheme of analysis is given with the question, and the respondent 
is asked to code the answer which he would have given had he been 
allowed to talk fully. Under these circumstances, if the analysis 
outline does not fit the respondent’s frame of reference, the only 
alternatives open to the respondent are to refuse to answer or to 
indicate a categorization which is not accurate. Crutchfield and 
Gordon (16) have produced convincing documentation of this danger 
by following up a fixed-alternative question with a series of free- 
answer questions designed to reveal the respondent’s frame of ref- 
erence. 

Difficulties in getting an a priori analysis scheme to fit the verbal
materials have sometimes led analysts to abandon efforts to construct 
an analysis outline before studying the content of the material. The 
results of abandoning these a priori considerations tend to be the 
construction of an outline which reflects only the superficial or
phenotypical similarities and differences among the elements of the
content. Experience suggests that it is better procedure to start 
with an analysis outline and then to adapt it in a self-conscious and 
orderly fashion so as to make it fit the content being studied. In this 
way, it is possible to examine systematically the modifications in
the a priori scheme which are called for. If these modifications are
substantial, one may wish to conclude either that the original
outline was inadequately conceived or that the material chosen for
analysis was not, in fact, appropriate. In either event, the analyst
modifies his original conceptions in an explicit and self-conscious 
fashion. 

We should consider at this point the attempt that several inves- 
tigators have made to construct a standard, or “all purpose,” scheme 
of categories to be used in a wide variety of studies. Some of these 
schemes are extremely phenotypic, consisting of such categories as 
positive or negative affect. Others, however, are derived from a 
more or less developed conceptual system. Examples of these more 
genotypic standard schemes of analysis are Bales’ (4) categories for 
analyzing interactions in discussion groups and White’s (49) cate- 
gories for describing values employed in verbal materials. There 
can be little argument about the desirability of having standardized 
schemes of analysis so that different studies can be compared. It is 
probably no accident, however, that there are as yet relatively few 
studies conducted by independent investigators which use these 
schemes. To be satisfactory, the schemes must fit readily both a gen- 
erally accepted conceptual system and the specific content being 
employed in each new investigation. 

The Problem of Quantification 

One of the major reasons for developing an explicit and objec- 
tive scheme of analysis is that it makes possible quantification and 
measurement (provided that certain additional requirements are 
met). After material has been submitted to a scheme of analysis
meeting the four requirements of objectivity listed above, it is pos- 
sible to determine frequencies, establish quantitative relations, and 
engage generally in many of the operations usually thought of 
as “measurement.” The abstract features of measurement theory are
discussed in greater detail in Chapter 6. We shall limit our
discussion here to certain common problems and practices encountered
in most current work employing content analysis. 

THE UNIT OF ENUMERATION. The quantitative treatment of
symbolic materials requires that one specify clearly the unit in terms of
which quantification is performed. We shall refer to this as the 
unit of enumeration. In our earlier discussion we referred to
another unit (namely, the recording unit) as that segment of the
content which gets labeled when the analyst codes the content. It is
important to note that these two kinds of units are not necessarily
the same. Sometimes (as, for example, when the analyst merely
counts the number of recording units which fit a certain
categorization) the recording unit is exactly the same as the enumeration unit.
An illustration in which the two are the same might be the analysis 
of a speech given by a public official so as to reveal the number 
of times "American military strength" is employed as an argument 
for a certain foreign policy. In this case an "argument" is taken both 
as the recording unit and the enumeration unit. 

But let us consider an example in which the two units are 
not identical. One might characterize an entire editorial on foreign
aid as predominantly favorable or unfavorable and then, for
purposes of quantification, count the number of column inches of the
editorial. In this case, a column inch would be the unit of
enumeration, whereas the editorial as a whole would be the recording unit.
Quite different quantitative results may be obtained through use
of various units of recording and enumeration. In the latter
example, for instance, we might just as legitimately use the sentence as
the recording unit so as to be able to count the number of favorable
and unfavorable sentences in the editorial. Then we might find that
slightly more than half of the recording units in each of the
editorials are favorable. We should then conclude that 55 percent, let us
say, of the sentences are favorable to foreign aid. But if we use the
whole editorial as the recording unit and the column inch as the unit
of enumeration, we might conclude that 100 percent of the column
inches of the editorials is favorable. The units we choose to employ
must be determined by the purposes of the total analysis. 
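
The contrast between 55 percent and 100 percent in the editorial example can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the figures (20 sentences, 11 of them favorable, 12 column inches) are assumed values chosen to match the percentages in the text.

```python
def pct_favorable_sentences(sentence_codes):
    """Sentence is both the recording unit and the unit of enumeration."""
    favorable = sum(1 for code in sentence_codes if code == "favorable")
    return 100.0 * favorable / len(sentence_codes)

def pct_favorable_inches(editorials):
    """Editorial is the recording unit; column inch is the unit of enumeration.

    `editorials` is a list of (overall_code, column_inches) pairs.
    """
    total = sum(inches for _, inches in editorials)
    favorable = sum(inches for code, inches in editorials if code == "favorable")
    return 100.0 * favorable / total

# Assumed data: one editorial of 20 sentences, 11 favorable, 12 column inches long.
sentence_codes = ["favorable"] * 11 + ["unfavorable"] * 9
by_sentence = pct_favorable_sentences(sentence_codes)

# Coded as a whole, the same editorial is "predominantly favorable".
overall_code = "favorable" if by_sentence > 50 else "unfavorable"
by_inches = pct_favorable_inches([(overall_code, 12.0)])

print(by_sentence, by_inches)
```

The same material yields both figures. Neither is wrong, but a report that does not state its recording and enumeration units cannot be compared with another.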

In analyzing free-answer interviews, it is customary to take a
single respondent as the unit of enumeration. In this way,
quantitative statements are made concerning the number of people who
display a given characteristic. Some interview studies, however, have
used the recording unit as the unit of enumeration, with resulting
confusion. Consider an example in which this confusion arises. In
an interview each respondent is allowed to give several reasons for
taking a given political position (let us say, for favoring a particular
candidate). The analyst takes each reason as the recording unit and
then uses this unit as the unit of enumeration. Results are reported
in terms of the number of times a certain reason appears in the
entire collection of interviews without respect to the number of
respondents who mention that reason. The results are now confused.
Suppose, for example, that the number of reasons turns out to
equal the number of respondents in the study. What can one
conclude from this fact? Obviously one can conclude very little, because
the same result would have been obtained if each respondent had
given one reason or if one fifth of the respondents had given five
reasons each. This practice of using some segment of the content
rather than the respondent as the unit of enumeration is some- 
times defended on the grounds that the analyst is interested in 
measuring a “climate of opinion" or “culture” rather than char- 
acteristics of individuals. A convincing logic for this procedure, 
however, has yet to be developed. 
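
The ambiguity described above is easy to demonstrate. In the sketch below, two hypothetical sets of interviews yield the same total number of reasons, yet differ sharply once the respondent is taken as the unit of enumeration. The function names, respondent identifiers, and reason labels are all invented for illustration.

```python
def total_reasons(interviews):
    """Reason as unit of enumeration: count every reason mentioned."""
    return sum(len(reasons) for reasons in interviews.values())

def respondents_giving_reasons(interviews):
    """Respondent as unit of enumeration: count people who gave any reason."""
    return sum(1 for reasons in interviews.values() if reasons)

# Case A: each of five respondents gives one reason.
case_a = {f"respondent_{i}": ["honest"] for i in range(1, 6)}

# Case B: one fifth of the respondents (one of five) gives five reasons.
case_b = {f"respondent_{i}": [] for i in range(1, 6)}
case_b["respondent_1"] = ["honest", "experienced", "party", "record", "platform"]

print(total_reasons(case_a), total_reasons(case_b))
print(respondents_giving_reasons(case_a), respondents_giving_reasons(case_b))
```

Counting reasons gives five in both cases. Counting respondents separates them, five against one.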

THE SYSTEM OF CATEGORIZATION. Quantification and measure- 
ment depend not only upon a unit of enumeration, but also upon 
the existence of certain systematic relationships among the cate- 
gories. If the content is to be used as an aid to measurement, and 
if certain quantitative treatments are to be employed, the categories 
of each variable must be related to one another in certain definite 
ways. 

Lazarsfeld and Barton (29), in an illuminating discussion of the
logic of measurement, make a useful classification of the types of 
systems of categorization which are possible in coding qualitative 
materials. These may be referred to as (1) dichotomies, (2) serials,
and (3) variables. 

A system of classification which employs dichotomies calls essen- 
tially for a judgment of the presence or absence of the attribute 
in question. Examples of such a scheme are a listing of reasons, or 
the counting of a certain kind of word, phrase, theme, or value. 
Here the coder looks at each recording unit and notes either the 
presence or absence of the attribute under consideration. Reliable 
coding of this type requires that an explicit judgment of "presence
or absence" be made. Sometimes, under the time pressures of a
research project, the coder skims quickly through a body of material 
looking for certain indicators and noting their presence. The logic 
of this kind of coding requires, however, that every recording unit 
for which he does not note the presence of the indicator be taken 
to mean that the indicator is, in fact, not present. A failure to be 
sure that an indicator is not present often is found to lower the 
reliability of such coding. 
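
One way to enforce an explicit judgment of presence or absence is to record three states for each recording unit (present, absent, and not yet judged) and to refuse to tabulate until no unit remains unjudged. This is a sketch of that idea under assumed names. It is not a procedure given in the text.

```python
from typing import Dict, Optional

def tabulate_dichotomy(judgments: Dict[str, Optional[bool]]) -> Dict[str, int]:
    """Tabulate presence/absence, refusing to treat 'not judged' as 'absent'.

    True = indicator present, False = explicitly judged absent,
    None = the coder never made a judgment for that recording unit.
    """
    unjudged = sorted(unit for unit, j in judgments.items() if j is None)
    if unjudged:
        raise ValueError(f"no explicit judgment for units: {unjudged}")
    present = sum(1 for j in judgments.values() if j)
    return {"present": present, "absent": len(judgments) - present}

# Hypothetical recording units, all explicitly judged.
complete = {"u1": True, "u2": False, "u3": False, "u4": True}
counts = tabulate_dichotomy(complete)
print(counts)

# A skimmed unit (None) stops the tabulation instead of being counted as absent.
skimmed = {"u1": True, "u2": None}
try:
    tabulate_dichotomy(skimmed)
except ValueError as err:
    print(err)
```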
In analyzing many kinds of content, it is desirable to categorize
units in terms other than mere presence or absence. For example,
one may want to indicate that a statement or attitude has a certain
degree of intensity. Thus, instead of categorizing the statement as
merely present, one might wish to indicate that it reflects high,
medium, or low intensity of feeling. Such a system of categories may
be called a serial. It orders the categories in such a way that the
coded materials may be ranked. This means, for instance, that an
indicator categorized as “high” is above those categorized as
“medium” and “low,” that an indicator coded “medium” is above
a “low” one but below a “high” one, and that an indicator coded
“low” is below both “medium” and “high” ones. No assumptions
are made concerning the location of an absolute zero point. Most
scales found in content analysis at the present time are serials. An
example of such a scale is the common five-category scale of degree
of satisfaction, consisting of (1) “very satisfied,” (2) “satisfied,” (3)
“neutral or ambivalent,” (4) “dissatisfied,” and (5) “very
dissatisfied.” Another example would be the four-category scale of reported
frequency of behavior, made up of (1) “always,” (2) “usually,”
(3) “occasionally,” and (4) “never.” Not all serial scales need to
have the superficial appearance of being graduated. Scaling
procedures, such as those developed by Guttman, may result in a system
of categories which meet requirements of scalability without
possessing the obvious and apparent characteristics of a graduated
series.

If a system of categories not only establishes a serial order but
also designates equal intervals and an absolute zero, it meets the
full specifications of a variable.² Only a few of the coding schemes
employed in content analysis meet the requirements set for a true
variable. The most common ones are given in terms of time (such
as age of respondent or duration of a radio program), monetary
units (such as income, prices, or savings), or units of physical length
(such as distance the respondent lives from the public library, or
column inches of type). It is apparent that these examples do not
refer to psychological variables and that the process of coding
consists of little more than transcribing answers to the tabulation sheets.
If it were possible to employ true variables in the analysis of
psychological material, many mathematical operations not otherwise
possible could be performed on the data (for example, treating different
points on the scale as ratios).

2 The reader will note that the term variable has been used in two rather
different ways in this chapter. Prior to this section it has referred to the type
of attribute being described by a given set of categories. The more restricted
meaning of the term refers only to such types of attributes as are categorized by
a system of categories meeting special requirements. This double usage of the
term seems unavoidable until an acceptable term is adopted for the looser
meaning of the word.

In treating categorized data quantitatively, the basic operation 
is that of counting. This is true whether the system of categorization 
is a dichotomy, serial, or variable. After the material has been 
categorized, the usual procedure is to tabulate the frequencies ob- 
tained for each category. If the system of categorization is that of 
a dichotomy, the frequency for any given category is usually calcu- 
lated as a percentage of some total possible frequency. Thus, one 
notes the percentage of all respondents interviewed who mention 
economic imperialism as a cause of war, or the percentage of all 
value statements in a speech which appeal to strength. If the system 
of categorization is that of a serial or variable, the frequencies for 
each category may be noted and such measures as those of central 
tendency and of dispersion may be calculated. 
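
The counting operations described above can be sketched for a serial scale, using the five-category satisfaction scale mentioned earlier in the chapter. The codes below are invented data, and the function names are hypothetical. The median is used as the measure of central tendency because a serial establishes only rank order, which is a choice made here for illustration and not one stated in the text.

```python
from collections import Counter
from enum import IntEnum
from statistics import median_low

class Satisfaction(IntEnum):
    """Five-category serial; the numbers express rank order only."""
    VERY_SATISFIED = 1
    SATISFIED = 2
    NEUTRAL_OR_AMBIVALENT = 3
    DISSATISFIED = 4
    VERY_DISSATISFIED = 5

def percentages(codes):
    """Frequency of each category as a percentage of all coded units."""
    if not codes:
        raise ValueError("no coded units")
    counts = Counter(codes)
    return {cat: 100.0 * counts.get(cat, 0) / len(codes) for cat in Satisfaction}

def median_category(codes):
    """Middle category by rank; median_low always returns an actual category."""
    return Satisfaction(median_low(int(c) for c in codes))

# Invented codes for ten respondents.
codes = [Satisfaction(v) for v in (1, 2, 2, 3, 2, 4, 5, 2, 1, 3)]
pct = percentages(codes)
middle = median_category(codes)
print(pct[Satisfaction.SATISFIED], middle.name)
```

A mean of the rank numbers is deliberately not computed, since it would assume the equal intervals that only a true variable provides.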

MAJOR REASONS FOR DETERMINING QUANTITATIVE RELATIONS.
Basically, the social scientist is interested in quantifying symbolic 
material so that he can compare different sets of material and ex- 
amine relations in a precise way. He may wish to do these things 
for any of several purposes. Let us consider briefly some of the more 
common of these. 

There are two basic kinds of questions that are raised in most 
descriptive studies: (1) How do symbolic materials vary over time? 
and (2) How do materials produced by different sources differ from 
one another? Many examples of both types of relationships were 
given in the survey of uses of content analysis presented above. In 
establishing trends over time and in comparing different kinds of 
materials, it is essential that the same system of categories, the same 
operational definitions of the categories, and the same units of 
recording and of enumeration be used in quantifying the mate- 
rials being compared. This requirement is sometimes difficult to 
meet when the several materials are quite different in content. If 
the frame of reference of the person producing the content changes 
over time, or if different frames of references are used by different 
producers of content, the analyst may want to change his system
of categories to produce a better fit. But if he does so, he cannot then 
make strictly satisfactory quantitative comparisons. This problem 
is particularly acute in comparisons over long periods of time and 
in comparisons between widely different cultures, and no fully 
satisfactory solution has yet been worked out. 

When it is possible to set up in quantitative terms certain norms 
or ideal states, it is then sometimes possible to obtain quantitative 
measures of the degree of deviation from these norms. Several ex- 
amples of this type of investigation were mentioned in the first 
section of this chapter. In order to reveal degree of deviation from 
a norm, the study must be designed so that the norms and the coded 
materials are stated in equivalent units. The study of majority and 
minority Americans in magazine fiction made by Berelson and
Salter (7) illustrates one way in which such comparisons may be 
accomplished. These investigators took as a norm the proportion 
of the total American population represented by various minority 
groups and then calculated similar proportions for the appearance
of various minority groups in a certain fictional population. Discrep-
ancies could then be stated quantitatively.
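
A comparison of this kind can be sketched as follows. The group labels, the norm proportions, and the counts are all hypothetical placeholders chosen for illustration; they are not figures from the Berelson and Salter study. The one requirement the sketch enforces is the one stated above: norm and coded material are expressed in the same unit, here a proportion of a population.

```python
def deviation_from_norm(norm_share, observed_counts):
    """Observed share minus norm share for each group.

    norm_share      -- proportion of the reference population per group
    observed_counts -- number of coded characters per group
    """
    total = sum(observed_counts.values())
    return {
        group: observed_counts.get(group, 0) / total - share
        for group, share in norm_share.items()
    }

# Hypothetical figures only.
norm = {"group A": 0.60, "group B": 0.40}
coded_characters = {"group A": 90, "group B": 10}

discrepancies = deviation_from_norm(norm, coded_characters)
# A positive value means over-representation relative to the norm,
# a negative value under-representation.
```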

In some studies of this type, the norms are stated in terms of
some ideal pattern of the coded items. Thus, in an interview, an 
index of “information about world affairs” might be defined as 
the percentage of "correct" statements made in answer to certain 
questions. Here, the ideal state would presumably be 100 percent. 
Subgroups of the population could then be compared in the amount 
they deviate, on the average, from this ideal. A somewhat similar 
approach is illustrated by the coefficient of imbalance developed 
by Janis and Fadner (23). This coefficient results in a value of zero
if the number of favorable statements equals the number of un- 
favorable ones. Furthermore, a quantitative measurement of devi- 
ation from such balance is given by use of the coefficient. If bal- 
anced presentation were set up as an ideal (as is often done for 
mass media in treating controversial subjects), this coefficient could 
then serve to measure how closely any given producer of content 
conforms to the ideal.
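
The text above does not reproduce the formula for the coefficient. The sketch below uses the form in which it is usually stated in later methodological literature, and it should be checked against the original paper (23) before being relied upon. The parameter names `f`, `u`, `r`, and `t` (favorable units, unfavorable units, relevant units, total units) are labels assumed for this illustration.

```python
def coefficient_of_imbalance(f, u, r, t):
    """Coefficient of imbalance in its commonly cited form (an assumption here).

    f -- number of favorable units
    u -- number of unfavorable units
    r -- number of units relevant to the topic (f + u + neutral)
    t -- total number of units in the content
    """
    if f < 0 or u < 0 or r <= 0 or t <= 0 or f + u > r or r > t:
        raise ValueError("expected 0 <= f + u <= r <= t with r, t > 0")
    if f >= u:
        return (f * f - f * u) / (r * t)   # zero or positive: favorable imbalance
    return (f * u - u * u) / (r * t)       # negative: unfavorable imbalance

balanced = coefficient_of_imbalance(f=5, u=5, r=12, t=20)
all_favorable = coefficient_of_imbalance(f=10, u=0, r=10, t=10)
all_unfavorable = coefficient_of_imbalance(f=0, u=10, r=10, t=10)
```

In this form the value is zero when favorable and unfavorable units are equal in number, as the text states, and it moves toward +1 or -1 as the content becomes wholly favorable or wholly unfavorable.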

One ultimate objective of social-science research is, of course,
the discovery of causal relations. The fundamental problems in
constructing and using any research method bear directly or indi-
rectly upon this objective. We shall consider here only a few of
the special problems involved in using content analysis in determin-
ing causation.

The covariation of two attributes is commonly taken to suggest
that there might be a causal relation between them. When symbolic
material has been analyzed in a quantitative fashion, it lends itself
to this type of study. It should be noted that for this purpose the
two variables need not necessarily be expressed in the same units.
It is possible to assert, for example, that a reduction of income
reduces personal optimism without measuring the two things in
similar units.


For many purposes a causal analysis is undertaken by looking
for covariation of attributes within the same body of content. The
contiguity of certain themes in written material was taken by
Baldwin (3) as evidence for a functional interdependence. This
method is quite common in the analysis of interviews in sample
surveys. It may be illustrated by an unpublished study conducted
during World War II in which a substantial correlation was shown
to exist between expressions of internationalism and statements
[...]. This finding was taken to support [...] the hypothesis that
a basic ideology influenced specific attitudes.

[...] possible to demonstrate covariation between [...] variable.
In interview [...] common if we assume that such things as reported
[...] income, and the like correctly reflect [...]. The [...] political
predisposition [...] Lazarsfeld, Berelson, and Gaudet (30) from reported
religion, socioeconomic status, [...] illustrates this approach. The
[...] showed that certain combinations of these three characteristics
appeared significantly to predispose [...] political candidates. Nearly
all attitude and opinion [...].

[...] approach to discovering causation should also be noted. [...]
assumption here is that the analyst is sometimes [...] from the nature
of the content. In [...] surveys, respondents are often asked to state
reasons for their attitudes or behavior. The coding of such reasons
may reveal something of the nature of causation. Although there is
ample evidence that people may not have correct insight into the 
determinants of their behavior, this important source of informa- 
tion should not be ignored. 

In the ideal approach to the problem of determining causation, 
a variety of techniques will be used. It is often possible to combine 
in the same study the two approaches just mentioned. For example, 
in studying the war-bond program of the United States government 
during World War II, Cartwright (13) asked respondents to report
why they had bought bonds during a specific drive. In addition, he 
examined the relationships between reported bond buying and re- 
ports of what happened to respondents during the drive. Some 
people reported that they bought bonds because they had been 
personally solicited. Analysis of the interviews showed also that 
people who reported having been personally solicited were much 
more likely to have bought bonds than people who did not report 
a solicitation. The agreement between these two types of evidence 
heightens one’s confidence in the conclusion that personal solicita-
tion was a causal determinant of bond buying.

Under the most favorable conditions, causation will be deter-
mined by independent manipulation and measurement of the inde-
pendent variables. The study of demoralization under bombing, 
cited above (46), approximates this design. Here, quantitative indi- 
cations of demoralization in written letters were related to intensity 
of bombing (measured in tonnage dropped) in the community where
the letter writer lived. This example shows, incidentally, that 
manipulation of the independent variable need not necessarily be 
performed by the researcher himself. 


The Problem of Significance 

One of the most serious criticisms that can be made of much 
of the research employing content analysis is that the "findings” 
have no clear significance for either theory or practice. In reviewing
the work in this field, one is struck by the number of studies which
apparently have been guided by a sheer fascination with counting.
Unfortunately, it is possible for a content analysis to meet all the 
requirements of objectivity and quantification enumerated above 
without making any appreciable contribution to theory or practice. 
It is an all too common error to equate “scientific” with “reliable
and quantitative.” Unless the findings of a content analysis have 
implications for some theory, however vaguely formulated, the study 
can merit serious attention only on the highly tenuous claim that 
some day the significance of the findings will become apparent. 

For this reason, significant content analysis begins with some 
systematic problem whose solution will be determined by the specific 
nature of the data resulting from the analysis. This problem may 
stem either from a desire to extend a theory or conceptual model 
to some new realm of phenomena or from a need to predict or 
control events for some practical end. In either case the investigator 
must have an a priori conception of the variables that are relevant 
[...] of the content analysis is to indicate
the presence or absence of these variables in the "real world," some-
thing about [...] of the variables, and something
about the relations among different variables.
interviews would have been futile for the purposes of the study.

Another way of stating the requirement for a significant content 
analysis is to assert that the variables of the analysis outline must 
yield a genotypic rather than phenotypic description of the mate- 
rial. If the content is classified simply according to its superficial 
similarities and differences, little will be learned of relevance to 
“pure” theory or ‘‘practical” application. 

To insist that the variables of the analysis outline must yield 
genotypic descriptions does not imply, however, that the coder must 
always place the content directly into genotypic categories. Even if 
a variable is designed to reflect, for example, an attitude toward
Negroes, the coder need not necessarily rate responses on an attitude 
scale ranging, let us say, from “very favorable” to “very unfavor- 
able.” It is possible to employ the attitude as a variable and still 
have the coder categorize responses in more phenotypic terms 
(perhaps by noting the presence or absence of certain stereotypic 
characterizations of Negroes). If this latter procedure is employed, 
it is necessary for the analyst to have some explicit procedure for 
placing these phenotypic indicators upon the attitude scale. Which 
of these two procedures will result in more reliable and valid classi- 
fication will depend largely upon the skill and sophistication of the 
coder. Usually it is easier to obtain good reliabilities with more 
phenotypic categories. There is also an advantage in being able to 
list explicitly the indicators used in rating attitudes. When the 
content dealt with, however, is complex and subtle, it is often found 
more economical to employ sophisticated coders who are able to 
interpret directly the genotypic significance of the material.


The Problem of Generalization

As a rule the content analyst is not interested in limiting his 
conclusions or findings strictly to the content actually analyzed. 
Almost invariably he undertakes his specific analysis in order to
reveal something about a more general universe of data than just 
those symbolic materials (produced at a certain place and time) 
with which he deals. However, generalizations from a limited set 
of data to a more inclusive universe cannot be made legitimately 
unless certain conditions are met and certain procedures followed.

In considering the problem of generalization, it is convenient 
to distinguish two rather different types of inference which may be
involved in the process. The first type rests upon the assumption 
that the materials analyzed are a representative sample of some 
specified universe of (actual or potential) materials. The need to make 
this kind of inference derives from practical considerations of con- 
ducting research. Money and time will be saved if the description 
of a small sample can be taken as a safe description of the complete 
universe. The second type of inference rests upon the assumption 
that the "discovered" relations between certain conditions and cer-
tain consequences are universally true. In this type of generalization
it is asserted that whenever and wherever the specified conditions
obtain, there will follow the specified consequences, and no assump- 
tion need be made that the quantitative incidence of certain condi- 
tions found in the sample will also be found in the universe. 


ASSURING THE REPRESENTATIVENESS OF A SAMPLE. In principle,
a satisfactory system for sampling materials in a content analysis will
consist of four elements: (1) specification of the universe to which
generalizations are to be made, (2) a guarantee that every unit of
the universe has a known probability of inclusion in the sample,
(3) a procedure of sampling which is independent of correlations
among units of the universe, and (4) a large enough sample to
provide a sufficiently small random error of sampling.

The theory and practice of sampling have been extensively 
developed in recent years, and the reader is referred to Chapter 5
for a systematic discussion of the general problem. Here we shall 
limit our discussion to some of the more special problems encoun- 
tered in applying sampling theory to content analysis. 

Consider, first, the problem of specifying the universe of sym-
bolic materials to which generalizations will be made. In any given
study, the universe that should be selected will depend upon the
purposes of the investigation. If, for example, the purpose is to
[...] the content of a given newspaper against some
standard so that legal action may be taken (as in determining the
Axis [...] of native fascist papers during World War II), the universe
under consideration should be all editorial content appearing in
issues of that one newspaper over a certain period of time. In
this case the problem of sampling is to guarantee that the selected
specimens of content accurately represent the total output of the
newspaper. If, however, the purpose of the study is to compare
national cultures through analyzing magazine fiction of each, then
the universe might well be all fiction in all magazines appearing 
in the country during a certain time. Now the sampling problem 
consists of selecting a representative sample of magazines as well as 
a representative sample of material from each magazine. 

This latter example may serve to illustrate one particularly 
difficult problem in some kinds of content analysis. If magazine
fiction is to be used as a "reflection” of a nation’s culture, should 
all fiction in all magazines be given equal weight? Should the arti- 
cles be weighted in some fashion to reflect the number of readers 
of the magazine? Should articles in a given magazine be weighted 
according to their placement in the magazine, their length, etc.? 
These questions suggest alternative ways in which the universe of 
the study might be defined. Any of the procedures might be tech- 
nically feasible. The choice among them should be governed by 
the conceptual scheme guiding the research, including, in this 
illustration, such things as the conceptual definition of "culture.” 
Is "culture," for example, to be defined only in terms of "symbolic 
products" or must its description and measurement take into ac-
count the number and characteristics of people who are in contact
with these products? If he is to be able to justify generalizations 
from his analyzed content, the investigator must be able to state the 
rationale for using a given universe of content and to define that 
universe precisely. 

After the universe has been selected for a given investigation, 
proper procedures for drawing a sample of that universe must be 
employed. Each unit of the universe must have a known probability
of inclusion in the sample, and the procedure of selecting units must 
be independent of any correlations among the units of the universe. 
These requirements apply both to the selection of sources, if the 
universe contains more than one source (producer of content, such 
as respondents in a survey or newspapers, etc.), and to the selection 
of content from any one source. 

Let us illustrate some of the dangers that arise because of cor- 
relations among units of the universe. Suppose that we have selected 
as our source a single newspaper and that we are going to sample 
content from it. Certain procedures that we might employ would 
produce biased samples, even though we guaranteed that every issue 
of the newspaper had an equal probability of inclusion in the
sample. This possibility may be dramatized by an extreme instance. 
Assume that we order the issues as they appeared in time and that 
we sample one out of every seven issues of the paper. Suppose that 
by some chance procedure we happen to select Sunday as the starting 
point. Now our entire sample will consist of issues appearing on
Sunday, and we shall have a disproportionately large incidence of 
features which appear only on Sundays. Obviously, this would be a 
bad sample of the total output of the paper. 
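
The extreme instance can be reproduced directly. The sketch below is illustrative only: the ten-week run of daily issues and the function name `systematic_sample` are assumptions, not part of any study cited here. It draws every seventh issue from a Sunday start and, for contrast, a simple random sample of the same size.

```python
import random
from datetime import date, timedelta

def systematic_sample(units, step, start):
    """Take every `step`-th unit, beginning at index `start`."""
    return units[start::step]

# Ten weeks of daily issues; 7 January 1940 fell on a Sunday.
issues = [date(1940, 1, 7) + timedelta(days=i) for i in range(70)]

# Every seventh issue from a Sunday start: only Sundays are drawn.
every_seventh = systematic_sample(issues, step=7, start=0)
weekdays_drawn = {d.weekday() for d in every_seventh}  # Monday is 0, Sunday is 6

# A simple random sample of the same size, for contrast.
rng = random.Random(0)  # fixed seed only so that the sketch is repeatable
random_issues = rng.sample(issues, len(every_seventh))
```

Every issue had the same chance of selection before the starting point was chosen, yet the interval of seven coincides with the weekly cycle, so the sample contains one weekday only.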

Many kinds of orderly fluctuations may be found in the content 
from a given source. Mintz (39) has described three major types 
and has investigated some of the problems of sampling associated 
with each. The first of these he calls “primary trends.” The news-
paper treatment of some topic will often show a gradual build-up
over a period of days. If the procedure of sampling happens to select 
disproportionately from the beginning or ending phases of this 
trend, estimates of the amount of space devoted to the topic will 
be correspondingly too small or too large. It is clear, of course, that 
such trends are not always linear. The second type of orderly fluctu-
ation is a “cyclical trend.” An illustration of this kind is the weekly
schedule of certain topics in a newspaper, or the regular scheduling 
of certain types of radio programs at certain times of the day. If 
these peaks are either over- or undersampled, corresponding 
errors in estimating the total universe will result. The third type
of orderly fluctuation is where there are compensatory relations 
between adjoining units. Take, for example, a newspaper in which 
there is a tendency to give a continuing news story a big play on 
the first day but little space on the next day. If the sampling pro- 
cedure were to select issues of the paper appearing on alternate 
days, there might result systematic errors in estimating the total
universe. All these dangers can be minimized by following proce-
dures in which the selection of each sampling unit is independent
of the others.

In passing, it may be noted that under certain conditions there
will be an advantage in stratifying the universe. That is to say,
whenever there is reason to believe that certain classes of units may 
be more homogeneous than the total universe, these classes may be 
designated and the requirement set up that the sample contain a 
proper proportion of units from each class. In these circumstances, 
the selection of each unit should, of course, still be done in a random 
fashion. An illustration of this procedure would be one in which
the requirement was set up that one seventh of the sample of news- 
papers come from each day of the week. Then, if proper methods 
are employed in sampling the newspapers for each day, one may be 
sure that each day will be properly represented in the total sample. 
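
A minimal sketch of this stratified procedure follows, assuming a hypothetical ten-week run of daily issues. The function name `stratified_by_weekday` and its parameters are introduced here for illustration. Each day of the week forms a stratum, and within each stratum the issues are still chosen at random.

```python
import random
from collections import Counter, defaultdict
from datetime import date, timedelta

def stratified_by_weekday(issues, per_day, rng):
    """Draw `per_day` issues at random from each day-of-week stratum."""
    strata = defaultdict(list)
    for issue in issues:
        strata[issue.weekday()].append(issue)
    sample = []
    for weekday in sorted(strata):
        sample.extend(rng.sample(strata[weekday], per_day))
    return sample

# Hypothetical universe: ten weeks of daily issues.
issues = [date(1940, 1, 7) + timedelta(days=i) for i in range(70)]

stratified = stratified_by_weekday(issues, per_day=2, rng=random.Random(0))
issues_per_weekday = Counter(d.weekday() for d in stratified)
```

With two issues per day, exactly one seventh of the fourteen sampled issues comes from each day of the week, whatever the random draw within each stratum happens to be.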

ESTABLISHING UNIVERSAL PROPOSITIONS ABOUT RELATIONS BETWEEN
CONDITIONS AND CONSEQUENCES. The ideal goal of the social psy-
chologist is that he be able to construct universally true statements 
about relationships among variables. Although his working level of 
aspiration is ordinarily more modest, his research should neverthe- 
less be designed so that he may approach this ideal. The problems 
involved in establishing scientific laws are general matters having
to do with concept development, hypothesis formation, research 
design, etc., and cannot be discussed fully in this connection. Our 
earlier discussions of the problems of determining causation in 
content analysis and of producing significant findings treat some of 
the more important considerations especially related to the analysis 
of qualitative materials. Once a universal proposition has been 
tentatively formulated, the research task becomes that of replicating 
the study, seeking limiting conditions, and analyzing apparently 
"exceptional" cases.

It may be useful at this point to illustrate the difference between 
two major types of generalization by means of a specific example. 
Lazarsfeld, Berelson, and Gaudet (30), in their study of voting be- 
havior in Erie County, Ohio, in 1940, found that certain factors 
such as religion, socioeconomic status, and rural or urban residence
predisposed a voter to cast his ballot for one party rather than 
another. This "finding" was based upon a sample of interviews in 
the county. A generalization of this finding to all residents of Erie 
County rests upon the assumption that the sample employed was 
representative of the entire county. Since a single county cannot 
be taken to be a representative sample of all counties in the United
States, no safe generalization of this finding can be made to the 
country as a whole. Furthermore, any generalization to future or 
past elections with the same county cannot be made safely without 
further evidence that those conditions producing this predisposition 
remain constant over time.

In this study another type of "finding" was also produced. It
was discovered that people who were subjected simultaneously to
conflicting predisposing factors (such as a rural Catholic or a low-
income Protestant) displayed various symptoms of conflict in making 
a political decision. For example, they took longer to make up their 
minds and showed more vacillation in their party preference. The 
theory that conflicting forces or cross-pressures produce such symp- 
toms of conflict may be proposed as a universal “law” which should 
hold wherever or however such cross-pressures are exerted. The 
truth or falsity of this theory does not depend upon the representa-
tiveness of the sample employed in the study; any valid exception 
to it found anywhere would be sufficient to require a modification 
of the proposition. 


CONSTRUCTING THE ANALYSIS OUTLINE 

The preceding discussion has examined the major systematic 
principles involved in converting phenomena into scientific data. 
In conducting research, more is needed, however, than an under- 
standing of these fundamental principles. The success of any project 
will depend upon the degree to which these principles are expressed 
as actual procedures. Let us turn, then, to a consideration of some 
of the more concrete and detailed procedures involved in carrying 
out investigations that employ content analysis. 

How, specifically, does one go about constructing an analysis 
outline? Six steps for arriving at a satisfactory analysis outline may 
be indicated. These steps are intended to suggest clusters of inter- 
related decisions which the analyst must make. They are points at 
which it is useful to check the emerging outline against the general 
principles listed above. 

Step 1. Specify Needed Data

In laying out an analysis outline, it is essential that the investi- 
gator have clearly in mind specifically what data are required by 
his total research design. Ordinarily, he will encounter less difficulty 
in the long run if he is able at this point to work out his plans in 
sufficient detail so that he can tell what form his final tables will 
take. If, for example, his design calls for testing in a set of inter- 
views the relationships between information about international
affairs and attitude toward the United Nations, the analyst must 
specify before he constructs his outline just what data he will take 
to test this relation. He might decide that he will want to present 
in his report a matrix in which the columns indicate several posi-
tions on an attitude scale and the rows show different scores on an
information test. He may want to present the frequency of inter-
views falling in each cell and test whether the distribution differs 
significantly from a random one. A similar specification of needed 
data should be made for the entire investigation. 
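
The planned matrix and its test can be sketched in advance of any coding. The text does not name a particular test; the sketch below assumes the familiar chi-square statistic for a contingency table, and the cell frequencies are hypothetical. Rows stand for information scores and columns for positions on the attitude scale.

```python
def chi_square(table):
    """Chi-square statistic and degrees of freedom for a table of frequencies.

    `table` is a list of rows; each cell holds the number of interviews
    falling in that combination of row and column categories.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Frequency expected if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom

# Hypothetical frequencies: rows = low / high information,
# columns = unfavorable / favorable attitude.
related = [[10, 20], [20, 10]]
unrelated = [[10, 20], [20, 40]]

stat_related, dof_related = chi_square(related)
stat_unrelated, dof_unrelated = chi_square(unrelated)
```

The statistic would then be compared with a tabled chi-square value for the stated degrees of freedom. Laying out the table in this way before coding makes plain which variables and categories the analysis outline must supply.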


Step 2. Map Out Plans for Tabulation 

A great deal of trouble can be avoided by making explicit plans 
for the tabulation of coded data before constructing the analysis 
outline. It makes a good deal of difference, for example, whether 
the coded data are to be punched on cards for machine processing 
or to be tabulated by hand. Although the variables and the cate- 
gories of the outline will not usually be different for different 
methods of tabulation, their arrangement within the outline and 
the system of notation employed in coding may well be quite dif- 
ferent. Since tabulation by punch cards is accomplished by punching 
a numbered position in a numbered column, the appropriate nota- 
tions on coding sheets consist of indicating a variable by a column 
number (or numbers) and a category by a number within the 
column. Thus, “attitude toward the United Nations” might be as- 
signed to column “27.” In this column a favorable attitude might 
be given the number “1,” a neutral attitude number “2,” and an
unfavorable attitude number “5.” When there is insufficient evi-
dence for making any rating, the number “0” might be noted in the
same column. If tabulation is to be done by machine, it will be 
helpful to consult, at the time the analysis outline is being con- 
structed, a person experienced in machine tabulations so that the
many “tricks of the trade” and short-cuts which are possible can be 
built into the analysis outline. 
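
The column-and-number notation can be pictured as a small codebook. The sketch below is an assumption-laden illustration: the names `CODEBOOK` and `punch` are hypothetical, a card is represented simply as a mapping from column number to punched position, and the code numbers follow the example in the text as printed.

```python
# Column number -> variable name and its permitted code numbers.
CODEBOOK = {
    27: {
        "variable": "attitude toward the United Nations",
        "codes": {
            0: "insufficient evidence for rating",
            1: "favorable",
            2: "neutral",
            5: "unfavorable",
        },
    },
}

def punch(card, column, code):
    """Record `code` in `column` of `card`, rejecting undefined notations."""
    if column not in CODEBOOK:
        raise ValueError(f"column {column} is not defined in the codebook")
    if code not in CODEBOOK[column]["codes"]:
        raise ValueError(f"code {code} is not defined for column {column}")
    card[column] = code
    return card

card = punch({}, column=27, code=1)  # a favorable attitude
meaning = CODEBOOK[27]["codes"][card[27]]
```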


Step 3. Lay Out the Skeleton of the Outline

At this point, it will be useful to list the variables in terms of 
which the content is to be coded. If the investigation consists in 
analyzing interviews, these variables will be used to classify not only
various features of the answers to questions about the respondent’s
psychological make-up but also such matters as his age, income, mari-
tal status, and other demographic or behavioral characteristics. In list-
ing the variables to be included in the outline, care should be taken
to assure that all information needed on the punch cards is placed
on some variable. Thus the outline should contain provision for
coding the name of the study, the number of each enumeration unit
(interview, issue of a newspaper, etc.), the name of each coder, and
any information relevant to tests of reliability or other statistical
treatment.

In the literature on content analysis of communication mate-
rials, certain types of variables have been employed rather fre-
quently. These have been summarized by Berelson (6) under the two
broad headings of “What is said” and “How it is said.” The vari-
ables listed by him under each are given here, in order to indicate
some of the kinds of variables that one might profitably employ.

A. What is said

1. Subject matter: what is the communication about?
2. Direction: is the treatment favorable or unfavorable toward the subject?
3. Standard: what is the basis (or grounds) on which the classification of direction is made?
4. Values: what goals are explicitly or implicitly revealed?
5. Methods: what means or actions are employed to realize goals?
6. Traits: what characteristics of persons are revealed?
7. Actor: who initiates actions?
8. Authority: in whose name are statements made?
9. Origin: what is the place of origin of the communication?
10. Target: to whom is the communication particularly directed?

B. How it is said

1. Form of communication: is it fiction, news, television, etc.?
2. Form of statement: what is the grammatical or syntactical form of the unit of analysis?
3. Intensity: how much strength or excitement value does the communication have?
4. Device: what is the rhetorical or propagandistic character of the communication?

Step 4, Fill in Categories for Each Variable 

There are many systems of categories which may be employed 
for any given variable. The one chosen will depend upon the ob- 
jectives of the study and the type of measurement being under- 
taken. Whatever type of system is chosen, the analyst should check 
to see that it meets what Lazarsfeld and Barton (29) call "the require-
ment of logical correctness." A system of categories meets this
requirement if it is exhaustive and if its categories are mutually 
exclusive. It is exhaustive if there is a category in which to place 
every relevant item which may be found in the content. Its cate- 
gories are mutually exclusive if there is one and only one place to 
put an item within that system of categories. Although this require- 
ment of logical correctness appears simple and obvious, it is remark- 
able how frequently it is violated. Experience indicates that it will
be well to check each system of categories before it is finally
used, to be sure that it is satisfactory in this respect. Systems of 
categories which call for listings of themes, reasons, arguments, 
sources of influence, and the like seem especially vulnerable to this 
type of error. The following classification of places where people
were solicited to buy war bonds is not a far-fetched example: place 
of work, home, store, bank, post office. Now, this system of categories 
is neither exhaustive nor mutually exclusive. Obviously, there are
other places where solicitation might take place— and where would 
one categorize a farmer who was solicited at home on his farm? 
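
The check recommended above can be made mechanical when trial items are available. In the sketch below, the function name and the way an item is represented (as the set of place descriptions that apply to it) are assumptions made for illustration. An item that fits no category shows that the system is not exhaustive; an item that fits more than one shows that the categories are not mutually exclusive.

```python
PLACES = ["place of work", "home", "store", "bank", "post office"]

def check_logical_correctness(categories, items):
    """Return (unplaced, ambiguous) items for a system of categories.

    categories -- mapping of category name to a test function on an item
    items      -- trial items drawn from the content
    """
    unplaced, ambiguous = [], []
    for item in items:
        matches = [name for name, fits in categories.items() if fits(item)]
        if not matches:
            unplaced.append(item)             # system is not exhaustive
        elif len(matches) > 1:
            ambiguous.append((item, matches))  # not mutually exclusive
    return unplaced, ambiguous

# Each category applies when its name appears among the item's descriptions.
categories = {name: (lambda item, n=name: n in item) for name in PLACES}

trial_items = [
    {"bank"},
    {"home", "place of work"},  # the farmer solicited at home on his farm
    {"rally"},                  # a hypothetical place the system omits
]
unplaced, ambiguous = check_logical_correctness(categories, trial_items)
```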

In constructing categories, one is often confronted with a di- 
lemma: If a category is too broad, it conveys little specific meaning, 
but if it is too narrow, the coded material differs little from the 
“raw" material. One resolution of this dilemma is through use of 
grouped categories. Thus, a system of categories for classifying
“reasons for buying war bonds” might designate such broad cate-
gories as “Personal financial,” “National patriotic,” “National eco-
nomic,” etc. Then, under each heading there might be more specific
categories, such as “Bonds are safe investment,” “Bonds pay good
rate of interest,” “Money invested in bonds is exempt from tempta-
tions of spending,” etc. In the interpretation of findings from the
study, the investigator may utilize each level of classification for
different purposes.

If the analysis outline contains a considerable number of vari- 
ables, it is likely that rather similar systems of categories will be 
found among these variables. In the interest of coding speed and 
the reduction of errors, it has been found desirable to establish 
certain consistencies in the way in which the categories are arranged. 
The University of Michigan Survey Research Center (44) has estab-
lished certain conventions which it follows in categorizing free-
answer interviews. For example, the category “yes” is always given
the code number “1,” the category “no” number “5,” “don’t know”
number “9,” “none” number “0,” etc. In a similar way, it has been
found desirable to standardize the numbering system for scales so 
that they all progress in the same direction from positive to negative 
or from high to low. With such standardization, the coder can soon 
categorize material almost automatically. 
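
Such conventions amount to a fixed lookup table shared by every variable. The sketch below is illustrative only: the four code numbers are those reported above, while the function names and the choice to number scale positions from 1 at the high or positive end are assumptions made for the example.

```python
# Code numbers reported in the text: "yes" -> 1, "no" -> 5,
# "don't know" -> 9, "none" -> 0.
STANDARD_CODES = {"yes": 1, "no": 5, "don't know": 9, "none": 0}


def code_answer(category):
    """Return the conventional code number for a category label."""
    key = category.strip().lower()
    if key not in STANDARD_CODES:
        raise KeyError(f"no standard code for category {category!r}")
    return STANDARD_CODES[key]


def scale_codes(labels_high_to_low):
    """Number a scale so that every scale runs in the same direction.

    Assumption for this sketch: position 1 is the high (or positive) end.
    """
    return {label: i for i, label in enumerate(labels_high_to_low, start=1)}


print(code_answer("Yes"))
print(scale_codes(["high", "medium", "low"]))
```
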

When the analysis outline has been completed, with all the 
categories defined, a manual of instructions for coders should be 
written giving these definitions in clear operational terms. 

Step 5. Procedure for Unitizing the Material 

We have defined above three kinds of units which must be dealt 
with in any content analysis: the recording unit, the context unit, 
and the unit of enumeration. The specific working definitions to be 
used in the study should be established at this point in such a 
fashion that various coders can all unitize the same material in the 
same way. These definitions should be written down as a part of 
the coding instructions. The selection of definitions of these units
should be guided by the same theoretical framework that determines
the rest of the research design. "Practical" considerations of coding
efficiency and reliability should not be ignored in deciding such
things as the "size" of unit, but valid coding depends upon the
theoretically correct selection of units whose categorization can
properly be taken to indicate some significant feature of the
material.

The most common recording units in communication research 
are (1) a single word, (2) a theme, usually consisting of a subject 
and predicate or some larger unit which can be condensed into a
single assertion, (3) the smallest segment of content required to
yield a single characterization, such as an adjectival phrase, value
judgment, and the like, (4) a character, person, group, or institution
that is described in the content, (5) a paragraph or other natural
unit of meaning, and (6) an item such as an article, speech, radio
program, etc.

In analyzing free-answer interviews, the most frequently used
recording unit is the answer to a single question. It is not uncom- 
mon, however, to use larger or smaller units. For certain purposes 
an entire interview may be taken as a single unit and characterized 
as a whole. For other purposes a certain set of questions may be 
treated as one unit. Or there may be good reason to break down the 
answer to a single question into units consisting of single words, 
themes, value judgments, or reasons. 

Designation of the context unit is often left quite vague or to
the individual coder's judgment. Since the major purpose in setting
up a context unit larger than the recording unit is to provide better
bases for perceiving the "meaning" of the recording unit, there
seems to be some justification in allowing the coder to seek
clarification throughout the material. Such a procedure, however,
sometimes greatly reduces the reliability of coding. Whenever it is
possible, the coder should be given quite specific instructions somewhat
like the following: "Read the answers to questions 2, 3, and 4 before
categorizing the reasons given in question 5, but do not read the
answers to questions coming later in the interview," or "Read an
entire paragraph, but no more, before coding the value judgments
within the paragraph."

The unit of enumeration that seems to be most popular in
communication research is that of physical length (such as the
inch, etc.) or temporal duration. If such units are meaningful from
a theoretical point of view, they should be used, because they have
real advantages of reliability and susceptibility to mathematical
manipulations. In interview surveys, the most commonly used unit
of enumeration is the respondent. This, too, is a convenient unit,
because it is ordinarily safe to consider each respondent
quantitatively equal to every other. If there is some theoretical reason for
not treating each respondent equally, other units
may be required. For example, because of the interdependence
of several people who are supported by the
same source, it is desirable in some kinds of economic surveys to
employ a "spending unit" as the unit of enumeration. Information
may be obtained separately from each individual, but it would be
pooled into one "spending unit" in order to construct a single unit
for purposes of computing frequencies, means, and distributions.
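
The effect of changing the unit of enumeration can be seen in a small sketch. Everything in it is hypothetical (the records, the identifiers, and the income figures); it merely shows information gathered from individuals being pooled into spending units before a mean is computed.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (respondent, spending unit, income).
records = [
    ("r1", "u1", 3000),
    ("r2", "u1", 1000),
    ("r3", "u2", 2500),
]


def pool_by_spending_unit(rows):
    """Sum the income of all respondents belonging to one spending unit."""
    totals = defaultdict(int)
    for _respondent, unit, income in rows:
        totals[unit] += income
    return dict(totals)


pooled = pool_by_spending_unit(records)

# The same data give different means under the two units of enumeration.
mean_per_respondent = mean(income for _, _, income in records)
mean_per_unit = mean(pooled.values())
print(pooled, mean_per_respondent, mean_per_unit)
```

With the respondent as the unit there are three cases; with the spending unit there are two, and the frequencies, means, and distributions change accordingly.
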


Step 6. Try Out the Analysis Outline and Unitizing Procedure

After the analysis outline and the unitizing procedure have
been developed, they must be applied to the content in a preliminary
way in order to discover what modifications are needed. Ordinarily,
this trying out of the coding procedures is also used as a
training period for those people who are to do the final coding.
When this period is ended, the analysis outline should be fixed in
its final form and the coders should be "set" in their use of the
coding procedures.

This stage has been standardized at the University of Michigan
Survey Research Center (44) in a procedure known as the "Round
Robin." A sample of materials is collected, and each coder codes
it independently. All disagreements among coders are noted and
used as a preliminary check on the reliability of coding. These
disagreements are also examined to see what improvements in the
analysis outline should be made. It is not uncommon to make
substantial changes at this stage. Variables of the analysis outline
which do not fit the material well will need to be redefined or [...].
Systems of categories which are either not exhaustive
or not mutually exclusive can be detected and revised in the Round
Robin. The lists for those variables made up of listings can be
largely exhausted, so that few additions to the list will be made after
"production" coding begins. And, finally, the system of notation
on the coding sheets should be checked to determine whether it is most
convenient for coding and whether it will facilitate
tabulation as much as possible.

When the Round Robin is completed, the whole coding procedure
should be fixed [...]. Any modifications of the analysis outline
after final coding has begun must be made retroactive to all materials
coded before the change. Obviously, much time could be
spent in making such changes if they were to occur very often.


USING THE ANALYSIS OUTLINE

If the content analyst has skillfully taken the steps indicated 
above, he should now have an analysis outline well suited to his 
research objectives, appropriate to the content at hand, and amenable
to efficient tabulation and statistical treatment. The remaining
requirement is that he have coders who are able to use the analysis
outline as intended and in a standardized manner.* It is useful to
think of the coder as a measuring instrument which must be sensi- 
tive to variations in the material and dependable in the sense that 
it responds in the same way to functionally equivalent content. In 
order to have coders who possess these characteristics, it is necessary 
to select people with the proper abilities, to train them adequately, 
and to supervise their work effectively. 


Selection of Coders 

For satisfactory coding, certain skills and abilities are essential. 
The coder must be a sensitive person, well differentiated in respect 
to symbolic materials. He must be able to detect subtle differences 
of meaning but also to neglect differences that do not make a dif- 
ference for a specific purpose. In other words, he must be able to 
make use of the genotypic categories required by the analysis out- 
line. In most social-psychological research, this means that the coder 
must have some acquaintance with the concepts of social psychology. 
If the analysis outline requires only phenotypic categories or cate- 
gories defined in terms of everyday usage, the coder may well be 
an intelligent layman. A reasonably good level of intelligence is the
minimal requirement for any content analysis. 

If the quantity of material to be coded is great, an additional
requirement must be met. The process of coding involves the repetitive
application of the analysis outline to the material. Reliable
coding demands, therefore, that the outline be used in the same
way (the same operational definition of categories, the same frame
of reference, the same degree of differentiation, the same level of
attention to details, etc.) throughout the entire coding operation.
A person who is easily satiated with repetitive work will consequently
not make a good coder for a very long period of time.
Studies of satiation by Karsten (24), Kounin (25), and others have
shown that the very requirements of sensitivity, motivation, and
deep involvement in the task tend to hasten satiation if the meaning
of the task is that of sheer repetition. Such satiation produces errors
and variability in the application of the analysis outline to the
materials. Unfortunately, since no good test of susceptibility to satiation
has been developed, there is little that can be done at present to
minimize this problem through differential selection of coders. It
would appear, however, that people who view the task of coding
as routine, menial work will view it as mere repetition of the "same"
activity and will, as a result, be satiated more readily.

* This discussion of how to use the analysis outline is written
on the assumption that several people will do the actual coding. If the
volume of material to be analyzed is large, this assumption is realistic.
Even when the analyst and the coder are the same person, however, it is
desirable to proceed so that the whole procedure can be reproducible.
Unless the analyst forces himself to communicate his definitions of
categories, units, etc., he can have little assurance that his procedures
are in fact reproducible.

If the coding is to be carried out by a team of coders, it is neces- 
sary that they all come to apply the same definitions and frame of 
reference to the coding. The achievement of such a common ap- 
proach to the task is best accomplished through group discussions, 
the Round Robin, and the sharing of difficult coding decisions. A 
coder who is uncommunicative or ego-defensive will, therefore, not 
contribute well to this purpose and will probably heighten the 
unreliability of coding. Again, it must be pointed out that no very 
satisfactory objective test of these personality traits now exists, and 
that selection for these traits is difficult. 

In present practice it appears that good coders are discovered
mainly through a process of selection "under fire," and that some
provision might well be made to begin with a somewhat larger staff 
of coders than will “survive” to the end of the project. 

Large research organizations who maintain a permanent coding 
staff have found it difficult to maintain the same people over a period 
of years at a high level of morale. Sensitive and intelligent people 
who are acquainted with the concepts of social science rarely find it 
satisfying to make a life career of such repetitive and routine work. 
Rarely can such a person work full time at such a job for more than 
a year or two without considerable demoralization. Much better 
morale seems to result when the arrangement is for part-time or
periodic coding, and when the task can provide satisfaction other
than merely economic gain. College students who combine the
financial incentive with a larger goal of training or social service 
and who do coding as a part-time occupation appear ideally suited 
to this kind of task. 


Training of Coders 


Once the coders have been assembled for a given project, it is 
necessary to train them in the use of the analysis outline. In general, 
it is desirable to communicate to the coders a full understanding 
of the purposes of the project— why it is being done, what uses will 
be made of the findings, and any other motivations which activate
the project director in undertaking the study. A full understanding
of these matters on the part of the coders will allow them to do their
work more intelligently and with a higher level of motivation.
Unless these matters are communicated effectively, many decisions
made by the project director will appear meaningless and arbitrary
to the coders. There may be, upon occasion, specific hypotheses
which should be kept from the coders for fear that such information
might "contaminate" the coding, but decisions to withhold
information from the coders should always be made only when other
procedures cannot be followed equally well. Coders who better
understand how the analysis outline was constructed will be better able
to adopt the rationale behind the operational definitions of
categories and units.

After the general purposes of the project have been communicated,
instruction in the details of the outline may begin. The purpose
of this training is to establish a common frame of reference and
common operational definitions among all the coders. It is well to
begin this phase of training with oral and written explanations of
the variables and categories. Then, after the general definitions
have been grasped, the coders should begin trying out these
definitions on the materials. At this stage it is possible to move into the
Round Robin. As mentioned above, the independent coding of the
same materials by all the coders serves the purposes of testing
the analysis outline and of standardizing the coders. It is important
that the Round Robin be conducted with enough types of
content and with sufficient discussion so that all major problems that
might subsequently arise are worked out. A running record of coder
disagreements is essential as an aid in revising the outline and as
an indication of when the Round Robin may be safely terminated.
When, by actual performance, the coders have demonstrated their
ability to code reliably in the way called for by the research design,
"real" coding may finally begin.


Mechanics of Coding 

The orderly processing of materials requires that regularized 
procedures be established for their storage, assignment to coders, 
and recording. It has been found convenient to package together a 
collection of materials (perhaps ten interviews or ten issues of a 
newspaper) and to have a coder “check out” a package at a time. 
When one package has been coded, it is returned to a central storage 
place and a new package is taken. In the assignment of materials to
coders, it is desirable to randomize the materials so that any
systematic biases among coders will not appear as trends or correlations
in the coded data. An orderly assignment of materials to coders will
also assure, of course, that all materials get coded once and only
once. Similar care should be exercised in collecting and storing the
sheets upon which coding has been recorded. 
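
A randomized assignment of packages can be sketched as follows. This is a minimal illustration, not the procedure of any particular organization: the package identifiers, coder names, and the function assign_packages are hypothetical, and the sketch deals shuffled packages out in turn rather than having coders check them out one at a time as described above.

```python
import random


def assign_packages(package_ids, coders, seed=None):
    """Shuffle the packages, then deal them out to the coders in turn.

    Shuffling keeps any coder from receiving a systematic block of the
    material; dealing guarantees each package is assigned exactly once.
    """
    rng = random.Random(seed)
    shuffled = list(package_ids)
    rng.shuffle(shuffled)
    assignment = {coder: [] for coder in coders}
    for i, package in enumerate(shuffled):
        assignment[coders[i % len(coders)]].append(package)
    return assignment


# Hypothetical material: ten packages and three coders.
packages = [f"pkg-{n:02d}" for n in range(1, 11)]
assignment = assign_packages(packages, ["coder_a", "coder_b", "coder_c"], seed=7)
for coder, assigned in assignment.items():
    print(coder, assigned)
```
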

After the Round Robin has been completed and "final"
coding has begun, it may be found necessary to add new categories
to some of the variables of the outline. Or it may be discovered
that too many cases are falling into such catch-all categories as
"other." Whenever a coder comes across a recording unit for which
there is no category, he should bring this case to the attention of the
coding supervisor. The supervisor should assess the merit of adding
a new category by determining whether the new category would
be meaningful within the rationale of the system of categories
involved, by determining whether the case in point could not just as
well be placed in an existing category, and by judging whether the
new category would be used frequently enough to warrant its
separate existence outside one of the catch-all categories. If the
decision is made to modify the analysis outline, this change must be
made in the outline being used by all coders. Furthermore, some
procedure must be established to guarantee that the coding of all
previously coded materials is modified wherever required.

In using catch-all categories, it is sometimes useful to keep a
separate hand tabulation of the different items that are thus coded.
For example, suppose that a system of categories contains the
category "other reasons." Each time a recording unit is placed in the
category, a notation is made of the nature of the specific reason,
along with an identification of the unit. Then, if it is discovered that
some specific reason is appearing with a considerable frequency, it
may be separated from the "other reasons" category and tabulated
by itself. 
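
The hand tabulation described here is easily mirrored in a few lines. The sketch below rests on assumptions: the log entries, the unit identifiers, and the threshold of three occurrences are all hypothetical, and what counts as "considerable frequency" remains a matter for the supervisor's judgment.

```python
from collections import Counter


def frequent_other_reasons(other_log, min_count):
    """Return the specific reasons logged under "other reasons" that
    occur at least min_count times, with their frequencies."""
    counts = Counter(reason for _unit_id, reason in other_log)
    return {reason: n for reason, n in counts.items() if n >= min_count}


# Hypothetical log: (identification of the unit, nature of the reason).
other_log = [
    ("int-03/q5", "gift for child"),
    ("int-07/q5", "employer urged it"),
    ("int-11/q5", "gift for child"),
    ("int-19/q5", "gift for child"),
]
candidates = frequent_other_reasons(other_log, min_count=3)
print(candidates)
```

Because each notation carries the identification of its unit, the units concerned can be located and recoded if a reason is later given a category of its own.
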

As coding proceeds, it is important to hold periodic discussions 
among the coders to assure that the same frame of reference and 
operational definitions of categories are maintained throughout the
coding period. The entire group should discuss any persisting dis- 
agreements in the use of certain categories, and other problems 
arising out of experience with the analysis outline. 

The reliability of coding can be measured and stability of
coding promoted by use of "check coding." In this procedure a
certain percentage of the content is independently recoded by a "check
coder." The check coder may be someone who is taken as a sort of
criterion (perhaps the principal investigator), or a system similar to
the Round Robin may be used in which each coder serves as check
coder for each of the others. After the check coder has coded a
set of material previously coded by one coder, the two should get
together to discuss each of their disagreements. If the discussion
carries an atmosphere conducive to learning rather than
self-justification, this discussion can serve well to improve the quality of
coding as the study goes along.

Records of disagreements turned up in the check coding should
be kept and tabulated in various ways. These records of course can
be used as one type of measure of the total reliability of coding.
They can also, however, be broken down for more detailed
analyses. The various variables of the analysis outline should be
examined separately to determine whether any of them are
producing abnormally high unreliability. Each of the variables may
also be examined more minutely to determine what kinds
of disagreements were most common within a single variable. It may
be found, for example, that coders simply cannot distinguish
reliably between two adjacent categories and that these categories
should be merged into one for further tabulation. Finally, the
records should be tabulated separately for each coder in order to
yield a measure of his reliability as a coding instrument. The
interpretation of differences among coders should be made judiciously,
of course, because it is possible that the coder with the greatest
number of disagreements might be the most "valid" coder.
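
These tabulations can be sketched in a few lines. The example is hedged throughout: the record layout, the coder names, and the sample rows are hypothetical, and simple proportion of agreement is used only as one convenient measure of reliability, not as the measure prescribed by the text.

```python
from collections import Counter


def tabulate_disagreements(rows):
    """Tabulate check-coding records.

    Each row is (unit_id, variable, coder, coder_code, check_code).
    Returns the overall proportion of agreement together with counts of
    disagreements by variable, by coder, and by pair of confused categories.
    """
    total = len(rows)
    disagreements = [r for r in rows if r[3] != r[4]]
    by_variable = Counter(r[1] for r in disagreements)
    by_coder = Counter(r[2] for r in disagreements)
    confused_pairs = Counter(
        (r[1], tuple(sorted((r[3], r[4])))) for r in disagreements
    )
    agreement = 1 - len(disagreements) / total if total else None
    return agreement, by_variable, by_coder, confused_pairs


# Hypothetical check-coding records.
rows = [
    ("u1", "reason", "ann", 1, 1),
    ("u2", "reason", "ann", 2, 3),
    ("u3", "reason", "bob", 3, 2),
    ("u4", "attitude", "bob", 1, 1),
]
agreement, by_variable, by_coder, confused_pairs = tabulate_disagreements(rows)
print(agreement, by_variable, by_coder, confused_pairs)
```

In this invented example both disagreements involve categories 2 and 3 of the same variable, which is the kind of pattern that might suggest merging two adjacent categories.
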

The proportion of the material which should be check coded
will depend upon the uses that are to be made of it. For purposes
of training and of maintenance of constant standards, there is some
point in check coding a relatively larger proportion earlier in the
coding process and tapering off as coding becomes stabilized. In order
to construct a measure of reliability, it is best to employ a random 
sample of all materials. In some cases, where any error is deemed 
serious, it may be desirable to check code the entire set of material 
and to tabulate only the pooled judgment. 


SUMMARY 

The fundamental objective of all content analysis is to convert
phenomena (i.e., symbolic behavior of people) into scientific data.
We have specified four characteristics which scientific data must
display: (1) objectivity and reproducibility, (2) susceptibility to
measurement and quantification, (3) significance for systematic theory, either
"pure" or "applied," and (4) generalizability.

In constructing an analysis outline for a given project, it will
be useful to organize the work so that it consists of six steps, or
decisions. At each of these points the developing outline
should be checked against the formal requirements for scientific
data. These steps are: (1) specifying needed data, (2) mapping out
plans for tabulation, (3) laying out the skeleton of the outline, (4)
filling in categories for each variable, (5) establishing procedure for
unitizing the material, (6) trying out the analysis outline and
unitizing procedure on a sample of the material.

The successful use of a well-developed outline depends upon the
selection of capable coders, effective training of them in the outline
being used, and the establishment of good supervision so that proper
procedures of coding are followed.

Experience over a number of years with content analysis reveals
that when technically well executed it can be a most valuable tool
for the social scientist. It should be viewed, however, only as a tool.
Even when it is extremely well fashioned its scientific or practical
value may, in any specific project, turn out to be negligible. A
successful research project will combine both technical excellence and
a good research design aimed at answering significant research
questions.


CHAPTER ELEVEN


Theory and Methods of
Social Measurement

Clyde H. Coombs¹


What one "finds out" from one's data is a function of two
things: the information in the data and how this information is
extracted. What information the data contain depends on how they are
collected. Some methods of collecting data "permit" more
characteristics of behavior to exhibit themselves than do other methods.
Or, in opposite terms, some methods of collecting data impose
properties on the behavior that other methods do not. Obviously,
properties imposed on the data by the method of observation cannot
be inferred to be properties of the behavior in question.

The method of analysis, then, defines what the information is
and may or may not endow this information with certain properties.
A "strong" method of analysis endows the data with properties
which permit the information in the data to be used, for example,
to construct a unidimensional scale. Obviously, again, such a scale
cannot be inferred to be a characteristic of the behavior in question
if it is a necessary consequence of the method.

It therefore becomes desirable to study methods of collecting
data with respect to the amount and kind of information each
method contains about the behavior in question as distinct from
that imposed. Similarly, it becomes desirable to study the various
methods of analyzing data in terms of the characteristics or
properties each method imposes on the information in the data as a
necessary preliminary to extracting it.

¹ I wish to thank Leon Festinger [...] for reading the manuscript of
this chapter and contributing criticisms and suggestions.

All of this is a part of measurement theory, a subject of greater 
concern in the social sciences than it is in many other domains of 
knowledge. Measurement in the physical sciences usually means the 
assigning of numbers to observations (a process called "mapping"),
and the analysis of the data consists in manipulating or operating
on these numbers. The social scientist, taking physics as his model,
has frequently attempted to do the same. It is the thesis of this
chapter that the social scientist who follows such a procedure will
sometimes violate his data.

WHAT IS MEANT BY MEASUREMENT 

The objective of this first section is to show that the theory 
of measurement consists of a system of distinct theories, each cor- 
responding to what may be called a level of measurement, and that 
a given set of data may satisfy (permit the valid use of) some of these
levels of measurement but not others. This first section will be 
concerned with an incomplete generalization of the logic of measure- 
ment, with examples at each of the levels discussed. This will be 
followed by a section on a theory of data in which the distinction 
between collecting and analyzing data will be discussed in terms of 
certain abstracted invariants of the behavior of individuals. This 
theory of data is an effort to construct a framework in terms of which
all methods of collecting and analyzing data may be unified under 
a general system. In the final two sections, methods of collecting 
and analyzing social-psychological data will be discussed in the 
context of this general theory of measurement and the theory of
data. 

Throughout this chapter, the major emphasis will not be on the 
application of specific techniques. This type of material is available 
throughout the literature. Instead, emphasis will be placed on the 
assumptions underlying the several techniques and their structure
and interrelations. An understanding of these aspects of the nature
of measurement is necessary for an intelligent choice of a method of 
collecting data and of a method of analysis. The various methods of 
collecting data contain information which may differ both quanti- 
tatively and qualitatively. Similarly, the various methods of analyz- 
ing data may differ in the degree to which they rely on information 
contained in the data as contrasted with the structure and relation- 
ships they impose on the data. 

In the discussion to follow, some of the various levels of measure- 
ment within the general theory of measurement will be indicated. 
The nominal scale, which is the simplest possible level of
measurement, will be discussed first.²


The Nominal Scale 

Measurement in its simplest form consists of substituting sym- 
bols or names for real objects. When measurement consists only in 
this mapping of objects into symbols, the symbols constitute a 
nominal scale (29). Thus a system of occupational families or psy- 
chiatric classifications is an attempt to construct a nominal scale. A 
nominal scale has certain properties which may be formulated 
abstractly as axioms. For example, either the relation of "equal to"
or "not equal to" must hold between objects on a nominal scale.
This means that any pair of objects must clearly belong to the same
class or not belong to the same class. In addition, the relation of
equality must be symmetric and transitive. By symmetry is meant
that if the relation holds between a and b, it also holds between b
and a: symbolically, if a = b, then b = a. By transitivity is meant
that if a = b and b = c, then a = c.
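
These axioms can be checked mechanically for any proposed classification. The following is a minimal sketch under stated assumptions: the occupational labels and the function names are hypothetical, and "equal to" is taken to mean membership in the same class.

```python
from itertools import product

# Hypothetical mapping of objects into classes (a nominal scale).
classification = {
    "carpenter": "skilled trades",
    "plumber": "skilled trades",
    "teacher": "professions",
}


def equal(a, b):
    """Two objects are "equal" when they are mapped into the same class."""
    return classification[a] == classification[b]


def is_symmetric(objects):
    """If a = b, then b = a, for every pair of objects."""
    return all(equal(a, b) == equal(b, a) for a, b in product(objects, repeat=2))


def is_transitive(objects):
    """If a = b and b = c, then a = c, for every triple of objects."""
    return all(
        equal(a, c)
        for a, b, c in product(objects, repeat=3)
        if equal(a, b) and equal(b, c)
    )


objects = list(classification)
print(is_symmetric(objects), is_transitive(objects))
```

Any mapping that assigns each object to exactly one class will satisfy both checks; difficulties arise only when an object cannot be placed clearly in one class or another.
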

This level of measurement is so primitive that it is not always
recognized as measurement, but it is a necessary condition for all
higher levels of measurement.

The psychological processes of perception are representative of
measurement on a nominal scale. Perception may be regarded as
the mapping of stimuli into equivalence classes. The properties of
the "ideal" of an equivalence class then become the properties of
a specific object, and such phenomena as size constancy and a great
variety of other constancies are to be anticipated. The mappings of
individuals into such classes as athletes, politicians, Negroes, etc.,
are examples of stereotyping and constitute nominal scales. Cultural
and educational factors affect the construction of such nominal
classes, creating new ones, dismembering old ones, and creating
relationships among classes. The relationship between nominal
scales and perception is so close that whether or not an individual
perceives and what he perceives is dependent in the first place upon
the existence of nominal classes and in the second place upon the
range or spread of the classes.

² A more detailed generalization of the logic of measurement theory is
contained in Coombs (8, Chap. 1).


The Partially Ordered Scale 

Sometimes the objects in one class of a nominal scale are more 
than just different from the members of another class— they may bear 
some kind of a relationship to them. One such relationship is that the 
members of the one class are more of something than the members 
of the other class and it is meaningful to say that the relation
"greater than" (>), in some respect, holds between the members
of one class and the members of the other. Given a number of
equivalence classes, if such a relation holds between some pairs of
classes, the result is a partially ordered scale. For example, suppose
one wants to measure something to be called socioeconomic status. 
Let us also suppose, for the sake of simplicity of illustration, that 
this attribute is made up of income and educational level. If indi- 
vidual A has more income and at the same time more education than 
individual B, he can be said, then, to have a higher socioeconomic 
status than B, (A > B). Further, if B has more of both constituent
attributes than a third individual C, then not only is B > C, but
also A has higher socioeconomic status than C, (A > C). It is
apparent, then, that this relation is transitive. It is asymmetric,
however, because if A > B, then B ≯ A.

Suppose now that there were some fourth individual, D, who 
had less of both attributes than the first individual, A, and more of
both attributes than the third individual, C. It could also be said,
then, that, with respect to socioeconomic status, A > D > C. But
suppose at the same time that, although D has more income than
B, he has less education. This poses a problem. It is not immediately
clear whether B > D or D > B with respect to socioeconomic status
or, in fact, whether either of these relationships exists. If it is not
insisted that one of the relationships above must necessarily hold,
then this pair of individuals, B and D, may be spoken of as being
incomparable, and the scale of socioeconomic status of A, B, C, and
D constitutes a partial order of the form

      A
     / \
    B   D
     \ /
      C

where the connecting line between two individuals signifies that the
higher individual has more status than the lower individual. Where there is
no connecting line the two individuals are not comparable. Hence
in this partial order it cannot be said that B has more status than D
or vice versa. This level of measurement is called a partially ordered
scale and may be obtained whenever an attribute is made up of two
or more primitive attributes which do not combine additively or
compensate for each other.
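
The partial order just described can be reproduced in a brief sketch. The income and education figures are hypothetical and were chosen only so that the four individuals stand in the relations given in the text; the function names are likewise assumptions of the example.

```python
from itertools import combinations

# Hypothetical (income, years of education) for the four individuals.
people = {
    "A": (9000, 16),
    "B": (5000, 14),
    "C": (2000, 8),
    "D": (7000, 10),
}


def greater(x, y):
    """x > y only if x has more of BOTH constituent attributes than y."""
    return all(a > b for a, b in zip(people[x], people[y]))


def comparable(x, y):
    """Two individuals are comparable if the relation holds either way."""
    return greater(x, y) or greater(y, x)


incomparable = [
    pair for pair in combinations(sorted(people), 2) if not comparable(*pair)
]
print(incomparable)
```

Only the pair B and D fails to be ordered, since D has more income but less education than B.
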


The Ordinal Scale 

Beginning with a nominal scale and finding a relation (e.g., >)
to hold for some pairs, one can construct a partially ordered scale,
a higher level of measurement. If one finds that the relationship
holds for all pairs of objects from different classes, one can make an
appropriate change in the axioms to convert a partially ordered
scale into a simply ordered scale or what is usually known as the
ordinal scale.

Let us return to the question of socioeconomic status and see
what is required to construct an ordinal scale in that case. We
already have an ordinal scale there in all respects except that
individuals B and D are not comparable. To have an ordinal scale,
all pairs must be comparable. Individuals B and D
must be "found" to be comparable.
This can be done by equating one of the attributes to
the other. An example would be to equate a fixed amount of income to one
year of education. This would be a simple linear transformation, but
any system would do. In the absence of any natural equivalence, a decision
is made concerning these equivalences which assigns so many units of one attribute
to every unit of the other. Immediately, there is a
simple operational basis for ordering individuals or
for placing them in equivalence classes. An ordinal scale has thus
been constructed.
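
A brief sketch shows how such an equating converts the partial order into a simple order, and how much the result owes to the equating itself. All figures are hypothetical: the incomes and years of education are invented, and the two rates of exchange (1000 and 100 dollars per year of education) are arbitrary assumptions, not values taken from the text.

```python
# Hypothetical (income, years of education) for the four individuals.
people = {
    "A": (9000, 16),
    "B": (5000, 14),
    "C": (2000, 8),
    "D": (7000, 10),
}


def status_score(income, years, dollars_per_year):
    """Equate one year of education to a fixed number of dollars."""
    return income + dollars_per_year * years


def rank(dollars_per_year):
    """Order individuals from highest to lowest combined score."""
    return sorted(
        people,
        key=lambda p: status_score(*people[p], dollars_per_year),
        reverse=True,
    )


print(rank(1000))
print(rank(100))
```

Under the first rate B is placed above D, and under the second D is placed above B. The ordering of this pair is therefore a property imposed by the equating and not one found in the data.
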
variety of other constancies are to be anticipated. The mappings of 
individuals into such classes as athletes, politicians, Negroes, etc., 
are examples of stereotyping and constitute nominal scales. Cultural 
and educational factors affect the construction of such nominal 
classes, creating new ones, dismembering old ones, and creating 
relationships among classes. The relationship between nominal 
scales and perception is so close that whether or not an individual 
perceives and what he perceives is dependent in the first place upon 
the existence of nominal classes and in the second place upon the 
range or spread of the classes. 


The Partially Ordered Scale 


Sometimes the objects in one class of a nominal scale are more 
than just different from the members of another class; they may bear 
some kind of a relationship to them. One such relationship is that the 
members of the one class are more of something than the members 
of the other class and it is meaningful to say that the relation 
"greater than" (>), in some respect, holds between the members 
of one class and the members of the other. Given a number of 
equivalence classes, if such a relation holds between some pairs of 
classes, the result is a partially ordered scale. For example, suppose 
one wants to measure something to be called socioeconomic status. 
Let us also suppose, for the sake of simplicity of illustration, that 
this attribute is made up of income and educational level. If 
individual A has more income and at the same time more education than 
individual B, he can be said, then, to have a higher socioeconomic 
status than B (A > B). Further, if B has more of both constituent 
attributes than a third individual C, then not only is B > C, but 
also A has higher socioeconomic status than C (A > C). It is 
apparent, then, that this relation is transitive. It is asymmetric, 
however, because if A > B, then B ≯ A. 
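
The properties of this relation can be checked mechanically. The sketch below is a hedged illustration only: the income and education figures are hypothetical, the name `higher_status` is invented here, and a fourth individual D is included who has more income than B but less education, so that one pair is left unordered.

```python
from itertools import permutations

# Hypothetical (income in dollars, years of education) for four individuals.
people = {
    "A": (9000, 16),
    "B": (5000, 12),
    "C": (3000, 8),
    "D": (6000, 10),  # more income than B, but less education
}


def higher_status(x, y):
    """True if x exceeds y on both component attributes (x > y)."""
    return all(a > b for a, b in zip(people[x], people[y]))


# Every ordered pair for which the relation holds.
relation = {(x, y) for x, y in permutations(people, 2) if higher_status(x, y)}

# Asymmetry: x > y and y > x never hold together.
asymmetric = all((y, x) not in relation for x, y in relation)

# Transitivity: x > y and y > z imply x > z.
transitive = all(
    (x, z) in relation
    for x, y in relation
    for y2, z in relation
    if y == y2
)

# Pairs that the relation leaves unordered in both directions.
incomparable = sorted(
    (x, y)
    for x, y in permutations(people, 2)
    if x < y and (x, y) not in relation and (y, x) not in relation
)
```

Under these assumed figures the relation is asymmetric and transitive, and the only incomparable pair is B and D, which is exactly what makes the scale partially rather than simply ordered.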

Suppose now that there were some fourth individual, D, who 
had less of both attributes than the first individual, A, and more of 
both attributes than the third individual, C. It could also be said, 
then, that, with respect to socioeconomic status, A > D > C. But 
suppose at the same time that, although D has more income than 
B, he has less education. This poses a problem. It is not immediately 
clear whether B > D or D > B with respect to socioeconomic status 

to nonmetric systems. The determination of the dimensionality of 
a partial order (corresponding in principle to the rank of a 
correlation matrix) already appears possible, but routine procedures for 
decomposing a partial order into alternative sets of simple orders 
(corresponding in principle to the rotational problem of factor 
analysis) have still to be developed. 

A "natural" ordinal scale is obtained when the raw data themselves 
contain the relation "greater than" for all pairs. It is remarkably 
difficult to find an example of a simple order among social-
psychological variables, since partial orders are almost universal. 
Let us suppose, however, that we wish to measure the authority 
of some military personnel of various ranks and that instead of 
deriving the scale from military regulations we shall derive it from 
observational data. Let us take "authority" to mean "who commands 
whom" and represent this by the familiar symbol >. 
If individual B commands a platoon and individual A commands 
the company which includes that platoon, then A > B. This 
means a simple order along a chain of command. 
There is some difficulty, however, about the definition of an 
equivalence class corresponding to a specific ordinal position, that is, 
at the level of the nominal scale. Consider two individuals, C and D, 
each commanding a separate group. If there is a one-to-one 
correspondence between the members of the one group and the 
members of the other such that the men are interchangeable, 
then C and D can be regarded as equivalent and hence 
members of an equivalence class. If the men are not interchangeable 
within pairs, however, the authority structure becomes a partial 
order again. Let us imagine, for the sake of simplicity, 
that corresponding elements are interchangeable. We have then a 
simple order of "authority." 


The Ordered Metric Scale 

In the three scales discussed heretofore (nominal, partially 
ordered, and ordinal) the elements of the system were classes of 
objects and the relationships were those of equality and 
greater than. It should be observed that nothing has been said 
about a concept of distance between classes. A formal introduction 
of distance by means of a distance function is beyond the 

Suppose one were interested in measuring the popularity of 
individuals and one observed who likes whom. The relationship 
A > B means that individual A is liked by the same people who 
liked individual B and some others in addition. The data might 
be expected to yield a partial order. To construct a simple order, 
one needs only to assume that being liked by one individual is 
equivalent to being liked by any other individual. Then 
immediately the popularity of A is measured by the number of people who 
like A and at least an ordinal scale has been constructed (a certain 
abstract equivalence to the procedures of mental testing should be 
evident here). 
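
A small sketch may make the two steps concrete: first the partial order defined by set inclusion, then the simple order obtained by counting. The "who likes whom" data and the names `liked_by` and `more_popular` are hypothetical and serve only as an illustration.

```python
# Hypothetical "who likes whom" data: each person is mapped to the set of
# people who like that person.
liked_by = {
    "A": {"p", "q", "r"},
    "B": {"p", "q"},
    "C": {"r", "s"},
}


def more_popular(x, y):
    """x > y when x is liked by everyone who likes y and by others besides."""
    return liked_by[x] > liked_by[y]  # proper superset


# Partial order: A > B holds, but A and C are not comparable.
a_over_b = more_popular("A", "B")
a_and_c_comparable = more_popular("A", "C") or more_popular("C", "A")

# Assumed equivalence: being liked by any one person counts the same as
# being liked by any other, so popularity is simply a count.
popularity = {person: len(admirers) for person, admirers in liked_by.items()}
ranking = sorted(popularity, key=popularity.get, reverse=True)
```

Counting places A first and puts B and C in the same equivalence class, even though the raw data gave no basis for comparing C with either of the others. The comparability comes entirely from the assumed equivalence.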

The process indicated here is not an unusual one. It is because 
of precisely such procedures that operationism (2) was given so much 
attention. Operationism is the principle that the meaning to be 
given to a concept resides in the operations which give rise to the 
measure of the concept. Aside from the philosophical reservations 
one might have about operationism, the warning must be given in 
passing to avoid operationism in reverse. By operationism in reverse 
is meant endowing the measures with all the meanings associated 
with the concept. In the preceding example the concept of 
socioeconomic status would mean income, with a unit of education 
counted as so many dollars. A different operational definition 
would be a different concept; the equivalence of education to dollars 
is part of the definition. 

Attempts were made above to construct an ordinal scale by 
mapping a partially ordered attribute into a simple order. We saw 
that this mapping required a decision to be made on the exchange 
rate between the components and that every different exchange rate 
would result in a different simple order for a large population. If 
the experimenter insists on mapping a partial order into a simple 
order, the best he can usually do is to set up an objective, 
communicable rule for the substituting of components and impose it 
on the data. 

One alternative procedure is to study the partial order itself. 
It is possible, for example, to build a partial order by combining 
two or more simple orders in certain ways. Thus there is the 
suggestion that a partial order may be decomposed into simple orders, 
a process analogous to factor analysis. This problem appears 
soluble (10) and constitutes a generalization of multiple factor analysis 


Theory and Methods of Social Measurement 


479 


ordered metric, there must be information in the data which leads 
to such conclusions as that the difference in authority between a 
corporal and a buck sergeant is either greater than, less than, or 
equal to the difference in authority between a private and a corporal. 
If the raw data consist of "who bosses whom," it might be possible 
to construct an ordered metric rationally. Suppose, for example, 
that a buck sergeant had command over two corporals, each of 
whom commanded twenty privates. The buck sergeant would 
exceed the authority of a corporal by two corporals and twenty 
privates. Now consider the step from a buck sergeant to a master 
sergeant. Suppose that the master sergeant had command over three 
buck sergeants, each of whom commanded two corporals and their 
privates. The master sergeant would then exceed the 
buck sergeant by eighty privates, by four 
corporals, and by three buck sergeants. With respect to the 
measurement of authority, as defined here, the master sergeant exceeds 
the buck sergeant by more than the buck sergeant exceeds the 
corporal, and the conclusion could be drawn that the increment in 
authority from corporal to buck sergeant is less than the increment 
from buck sergeant to master sergeant. This yields an ordered metric 
scale of authority. 
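
The comparison of increments in this example can be sketched without any common unit, simply by comparing the kinds of subordinates gained at each step. The code below is an illustrative reading of the example, not a procedure given in the text; the names `commands`, `increment`, and `exceeds` are assumptions introduced here.

```python
from collections import Counter

# Subordinates commanded at each rank, following the example in the text.
commands = {
    "private": Counter(),
    "corporal": Counter(private=20),
    "buck sergeant": Counter(private=40, corporal=2),
    "master sergeant": Counter(private=120, corporal=6, buck_sergeant=3),
}


def increment(lower, higher):
    """Subordinates gained in stepping from the lower rank to the higher."""
    return commands[higher] - commands[lower]


def exceeds(x, y):
    """x > y when x has at least as many of every kind and more of some."""
    kinds = set(x) | set(y)
    return all(x[k] >= y[k] for k in kinds) and any(x[k] > y[k] for k in kinds)


step_cb = increment("corporal", "buck sergeant")
step_bm = increment("buck sergeant", "master sergeant")
later_step_is_larger = exceeds(step_bm, step_cb)
```

The step from buck sergeant to master sergeant (eighty privates, four corporals, three buck sergeants) exceeds the step from corporal to buck sergeant (twenty privates, two corporals) on every kind of subordinate, so the two distances can be ordered although no number has been assigned to either.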

In order to construct this ordered metric, 
certain assumptions had to be made. All squads of twenty privates 
were interchangeable, all corporals were interchangeable, and all 
buck sergeants were interchangeable. Within each class, there 
were one or more units of measurement, and these were not converted 
into a common unit. If the assumptions lead 
to an ordered metric which is bad, as this one may well 
be, one would tend to regard the assumptions as bad and to seek 
other assumptions. The assumptions, 
whether implicit or explicit, recognized or not recognized, 
constitute part of the operational definition of "authority" in that 
they lead directly to a "measure of authority" and hence 
determine part of the meaning that the concept "authority" has when so 
measured. 

Once an ordered metric for "authority" has been 
constructed in one way, it would be interesting to observe whether 
people perceive the metric relations as constructed here or whether, 
perhaps for psychological reasons, people perceive different metric 
relations. By use of the Method of Similarities, an adaptation of 

scope of this chapter, and the concept of distance between classes 
will be dealt with here on a purely intuitive level. In the scales 
discussed so far, all relationships such as "greater than" were 
between objects, which were stimuli or individuals. Consequently 
there was a complete absence of such a relationship between any 
pair of distances between objects. Thus, although A may have been 
observed to be greater than B, and B greater than C, nothing has 
been introduced concerning whether A was greater than B by a 
larger amount than B was greater than C. From this point of view 
the scales introduced up to this point may be regarded as nominal 
scales with respect to distances between classes of objects. 

This introduces the concept of composite scales in which there 
are relations between the objects and in which, in addition, there 
are relations between the distances between objects. The nominal 
scale first discussed may be seen to be a nominal-nominal scale, for 
it refers to objects first and distances between them secondly. 
Similarly, the partially ordered scale is a partially ordered-nominal scale, 
and the ordinal scale is an ordered-nominal scale. 

It immediately becomes evident for any one of these scales that 
if, in addition, a relationship is observed to hold on the distances 
between classes, then a higher level of measurement is achieved. 
When a relation of "greater than" holds for some pairs of distances 
between adjacent objects on an ordinal scale, the scale is an 
ordinal-partially ordered scale. And if this relation holds for all pairs of 
such distances, the scale is an ordered-ordered scale. 

For the sake of simplicity of discussion we shall not distinguish 
between those two levels but shall refer to them together as an 
ordered metric scale. By an ordered metric is meant a scale of which 
it can be said of any triplet of classes that a > b > c and also that 
for at least some intervals between classes, e.g., the intervals 
$\overline{ij}$ and $\overline{kl}$, either $\overline{ij} > \overline{kl}$ or $\overline{kl} > \overline{ij}$, where in general $\overline{jk}$ 
signifies the distance from j to k. 

To illustrate the idea of an ordered metric we shall take the 
previous example of a simply ordered scale of "authority" and 
build it into an ordered metric scale. For convenience of discussion, 
it will be best to give names to the equivalence classes on the 
ordinal scale. Let these be, in order of increasing authority, private, 
corporal, buck sergeant, and master sergeant. To establish an 

(CD), each larger than the step between corporal and buck sergeant 
(BC), were, in these data, incomparable. The simplified technique 
used did not yield data which contained this information; hence, 
no conclusion can be drawn. 
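
One way to see why triad judgments leave some distances incomparable is sketched below. It assumes, as a simplification, that the pair judged "most alike" in a triad has a smaller distance than either of the other two pairs in that triad; the judgments themselves are hypothetical and the name `smaller_than` is invented for this illustration.

```python
from itertools import combinations

# Hypothetical "most alike" judgments for each triad of the four stimuli,
# which lie in the order A, B, C, D on the subject's scale.
most_alike = {
    ("A", "B", "C"): ("B", "C"),
    ("B", "C", "D"): ("B", "C"),
    ("A", "B", "D"): ("A", "B"),
    ("A", "C", "D"): ("C", "D"),
}

smaller_than = set()  # (x, y) means distance x is smaller than distance y
for triad, closest in most_alike.items():
    for pair in combinations(triad, 2):
        if pair != closest:
            smaller_than.add(("".join(closest), "".join(pair)))

# Transitive closure of the observed comparisons.
changed = True
while changed:
    changed = False
    for x, y in list(smaller_than):
        for y2, z in list(smaller_than):
            if y == y2 and (x, z) not in smaller_than:
                smaller_than.add((x, z))
                changed = True


def comparable(x, y):
    """True if the data order the two distances in either direction."""
    return (x, y) in smaller_than or (y, x) in smaller_than
```

Under these assumed judgments BC is smaller than both AB and CD, but AB and CD never appear together in a triad and are not linked by transitivity, so the data say nothing about which of the two is larger.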

In the process of constructing an ordered metric for 
"authority," we have actually constructed two of them, one by 
definition and the other as "perceived." The question naturally arises 
as to which one is "better" or "right." This gives rise to a basic 
question: What is "authority," anyway, and, for that matter, just 
what is an "attribute"? It will not be profitable to pursue this subject 
here; instead it will be sufficient to point out that this is where the 
doctrine of operationism plays its role. The concept of authority 
has precisely such meaning as resides in the operations involved in 
observing it. To endow either of these scales with all the meanings 
and implications associated with the concept of authority is 
operationism in reverse and therefore specious. 

To summarize, the types of scales which have been discussed 
are the nominal scale, the partially ordered scale, the ordinal scale, 
and the ordered metric. In this order they represent successively 
more powerful levels of measurement, in the sense that data which 
satisfy successive levels contain more and more information. 

The Interval Scale 

The level of measurement represented by an interval scale 
involves a step above the ordered metric and a 
considerable increase in "power." It will be recalled that the ordered 
metric was characterized by a simple ordering of stimuli on a 
scale and by at least a partial ordering on the magnitudes of the 
distances between adjacent stimuli on the scale. An interval scale 
is characterized by the fact that it is known 
just how large the intervals between stimuli on the scale are. 
This requires a distance function which assigns a number to 
all pairs of elements in an ordered metric. Operationally, this condition 
is satisfied by the existence of a common and constant unit of 
measurement. In such a case numbers may be associated with the 
positions of the stimuli on the scale such that the operations of 

S. S. Stevens was the first to distinguish between and name interval 
and ratio scales (29). See also Stevens (30). 

the Unfolding Technique (6) (discussed in the last two sections), it 
is possible to take a single individual subject and determine the 
"structure" of the attribute "authority" as he perceives it for these 
stimuli. It can readily be determined whether his perception of 
these ranks satisfies a simply ordered system and what some of the 
metric relations are. 


Let A = private 
B = corporal 
C = buck sergeant 
D = master sergeant 

To illustrate the concept of an ordered metric scale, these four 
stimuli were administered by the Method of Similarities to two 
individuals and each individual's structure of the concept of 
"authority" was obtained. The stimuli were presented to the 
subject three at a time and he was asked to judge of each triad which 
two were most nearly the same in authority and which two were 
least alike in authority. These data could be unfolded for each 
individual and certain characteristics of the stimulus space 
determined. In this instance, the scale of "authority" for those four 
stimuli was unidimensional for each of the two individuals who 
were tested, and the rank order was A, B, C, D. This, of course, is 
hardly surprising. 

The information in these data on metric relationships between 
stimuli can be represented by the following partial order, which 
was the same for both individuals: 

[Diagram of the partial order among the distances between stimuli] 

where in general JK signifies the distance from stimulus J to 
stimulus K, and a line connecting two elements signifies that the one 
above is the larger. We see that the increment in "authority" in 
going from corporal to buck sergeant (BC) was psychologically the 
smallest of all. The increments represented by going from a private 
to a corporal (AB) and from a buck sergeant to a master sergeant 



to all the scale scores and the relations between intervals will be 
preserved. This is called "translation” and corresponds to the fact 
that the origin, the location of the numerical value "zero," is arbi- 
trary. Similarly all the scale scores may be multiplied by any given 
number and the relations between intervals will be preserved. This 
is called "scalar multiplication" and corresponds to the fact that 
the unit of measurement is arbitrary. A linear transformation con- 
sisting of both a shift in origin and a change in the unit of measure- 
ment will also preserve the relations between intervals. 
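
The invariance claimed here is easy to verify numerically. The sketch below uses the scale scores 1, 20, 80, and 480 constructed in the text for the four ranks; the function names and the particular shift and multiplier are arbitrary choices made for illustration.

```python
from fractions import Fraction

# Scale scores from the text's interval scale of authority.
scores = {"private": 1, "corporal": 20, "buck sergeant": 80, "master sergeant": 480}


def interval_ratio(s):
    """Ratio of the sergeant-to-master step to the corporal-to-sergeant step."""
    upper = s["master sergeant"] - s["buck sergeant"]
    lower = s["buck sergeant"] - s["corporal"]
    return Fraction(upper, lower)


def linear(s, unit, origin):
    """Apply x -> unit * x + origin to every scale score (unit is nonzero)."""
    return {rank: unit * x + origin for rank, x in s.items()}


original = interval_ratio(scores)
translated = interval_ratio(linear(scores, 1, 100))  # shift of origin only
rescaled = interval_ratio(linear(scores, 5, 0))      # change of unit only
both = interval_ratio(linear(scores, 5, 100))        # general linear transformation
```

All four ratios are equal, which is the sense in which the relations between intervals survive any change of origin or unit.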

This is an important consideration if one wants to compare 
the authority of a sergeant with, for example, the authority of a 
foreman of a repair crew. In the latter case, the unit of authority 
might be chosen to be "one repairman." To be meaningful, the 
comparison then requires that the origin, zero, be the same on both 
scales and that the unit of measurement be identical. This is the 
reason why one cannot compute the significance of a difference 
between the mean height and the mean weight of a group of people. 
As numbers are being used in each case there is a tendency to apply 
the operations of arithmetic because this is possible 
in the abstract system of numbers; but the results of these 
operations on the numbers do not necessarily have meaning in the 
real system of objects which the numbers represent. For a test 
of significance to be valid, both scales must be measured on the 
same interval scale. Under such conditions the interval scale permits 
the valid use of most of the tools of statistics. It 
is interesting to note that the product-moment correlation does not 
require that both variables have the same unit of 
measurement because this index of relationship is invariant under 
independent linear transformations of the variables. 
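
The behavior of the product-moment correlation under independent linear transformations can be checked with a short computation. The heights and weights below are hypothetical, and the conversion constants are ordinary unit conversions with an arbitrary shift of origin added; the scale factors are positive, since a negative factor would reverse the sign of the coefficient.

```python
from math import isclose, sqrt


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical heights (inches) and weights (pounds) for five people.
height_in = [62, 65, 68, 70, 74]
weight_lb = [120, 140, 155, 160, 190]

# Independent linear transformations: a new unit for one variable, and a
# new unit together with a new origin for the other.
height_cm = [2.54 * h for h in height_in]
weight_shifted_kg = [0.4536 * w - 50 for w in weight_lb]

r_original = pearson(height_in, weight_lb)
r_transformed = pearson(height_cm, weight_shifted_kg)
same = isclose(r_original, r_transformed, rel_tol=1e-9)
```

The coefficient is unchanged, whereas a difference between the mean height and the mean weight would take a different value under every such change of unit or origin.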


The Ratio Scale 

To complete this discussion of the various levels of 
measurement, the ratio scale will be presented. The ratio scale is an 
interval scale with the additional characteristic that the origin is 
an absolute zero and not an arbitrary zero. Under these circumstances, 
the operations of arithmetic are permissible not merely on the 
differences, as was the case in the interval scale, but on the scale values 
themselves. The numbers associated with scale values on a ratio 

arithmetic may be meaningfully performed on the differences between 
these numbers. 

It will be easiest to illustrate the concept of an interval scale 
by returning to the ordered metric previously constructed for 
"authority” and to try to convert it into an interval scale. To do 
this, a decision must be made as to what shall constitute a common 
and constant unit of authority for all classes or ranks. Suppose one 
decided that command over one private constituted a unit of 
authority, and that a corporal would thus represent twenty units 
of authority. A buck sergeant who commanded two corporals each 
commanding twenty men would, on this basis, be regarded as having 
eighty units of authority. A master sergeant, commanding three 
buck sergeants and all their men, would then represent 480 units 
of authority (120 privates at one unit each, six corporals at twenty 
units each, and three buck sergeants at eighty units each). 

The scale scores of those grades or ranks could then be repre- 
sented as follows: 


Rank               Scale score* 
private                  1 
corporal                20 
buck sergeant           80 
master sergeant        480 


As this is an interval scale, differences between scale values 
may be operated on arithmetically. Thus, for example, it can be 
said that a buck sergeant has sixty more units of authority than a 
corporal, a master sergeant has 400 more units of authority than 
a buck sergeant, and consequently the latter increment in authority 
is 6⅔ times the increment from a corporal to a buck sergeant. 
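
The arithmetic of this example can be reproduced from the command structure alone. The sketch below is an illustration under the text's definition (one unit per private, with every subordinate at every level counted); the names `direct`, `headcount`, `score`, and `direct_score` are assumptions introduced here. The function `direct_score` implements the alternative definition mentioned in the footnote, in which each rank is credited only with its immediate subordinates.

```python
from fractions import Fraction

# Immediate subordinates at each rank, as in the text's example.
direct = {
    "private": {},
    "corporal": {"private": 20},
    "buck sergeant": {"corporal": 2},
    "master sergeant": {"buck sergeant": 3},
}


def headcount(rank):
    """Everyone under a rank, at all levels, as {rank: number}."""
    total = {}
    for sub, n in direct[rank].items():
        total[sub] = total.get(sub, 0) + n
        for lower, m in headcount(sub).items():
            total[lower] = total.get(lower, 0) + n * m
    return total


def score(rank):
    """Scale score when all subordinates at every level are counted."""
    if rank == "private":
        return 1  # the text assigns the private a scale score of 1
    return sum(n * score(sub) for sub, n in headcount(rank).items())


def direct_score(rank):
    """Alternative score counting immediate subordinates only."""
    if rank == "private":
        return 1
    return sum(n * direct_score(sub) for sub, n in direct[rank].items())


scale = [score(r) for r in direct]
alternative = [direct_score(r) for r in direct]
ratio = Fraction(scale[3] - scale[2], scale[2] - scale[1])
```

The first definition gives 1, 20, 80, 480 and an increment ratio of 20/3, that is, 6⅔; the second gives 1, 20, 40, 120. The two sets of scores are not related by a linear transformation, which is why they amount to different concepts of "authority."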

A significant aspect of an interval scale is that the numbers 
associated with the points on the scale are "right” only within a 
linear transformation. For example, any given number can be added 

* This again is an example of how a concept is endowed with meaning by 
measuring it. There are other definitions which could be made which would 
result in nonlinear transformations of these measures. For example, one might 
argue that a buck sergeant commands only the corporals and not the men 
under the corporals and similarly that the master sergeant commands only the 
buck sergeants. The scale scores would then be 1, 20, 40, 120. Then if "authority" 
so measured bore a linear relation to some other variable, "authority" as 
measured in the text above would have a nonlinear relation to that variable. 
scale are then "right" within scalar multiplication, a consequence 
of the fact that only the unit of measurement is arbitrary. Under 
these conditions, it is possible to compute a meaningful ratio of 
two scale values. Suppose, for example, in the case of the interval 
scale of "authority" previously constructed, that zero could be 
properly associated with the scale value of a prisoner of war* as an 
absolute zero of authority. Then the scale values can no longer be 
translated but can only be multiplied by a scalar. The buck sergeant 
can then be said to have four times the authority of a corporal, and 
a master sergeant to have 24 times as much authority as a corporal. 
These relations would, of course, be preserved under scalar 
multiplication. 
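
A brief check shows which transformations leave these ratios intact. The scale values are those of the text; the multiplier 3 and the shift of 100 are arbitrary illustrative choices, and the function names are invented here.

```python
from fractions import Fraction

# Scale values from the text, now read as a ratio scale with an absolute zero.
scores = {"corporal": 20, "buck sergeant": 80, "master sergeant": 480}


def ratio_to_corporal(s, rank):
    """How many times the corporal's authority a given rank has."""
    return Fraction(s[rank], s["corporal"])


def transform(s, unit=1, origin=0):
    """Apply x -> unit * x + origin to every scale value."""
    return {rank: unit * x + origin for rank, x in s.items()}


scaled = transform(scores, unit=3)       # scalar multiplication
shifted = transform(scores, origin=100)  # translation, not admissible here

ratios = {
    "original": ratio_to_corporal(scores, "buck sergeant"),
    "scaled": ratio_to_corporal(scaled, "buck sergeant"),
    "shifted": ratio_to_corporal(shifted, "buck sergeant"),
}
```

Multiplying every value by 3 leaves the buck sergeant at four times the corporal, while adding 100 to every value changes the ratio to 3/2. This is why translation is excluded once the zero is taken to be absolute.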

Measurement vs. Scaling 

Because of the considerable increment in power between an 
ordered metric and an interval scale, there is a tendency to 
distinguish between two broad classes of scales. The theory of the ordered 
metric and less powerful scales may be referred to as scaling theory, 
and the theory of the interval and ratio scales as measurement 
theory (7). The former may also be thought of as qualitative 
measurement (if this is not a contradiction in terms) and the latter 
as quantitative measurement. Because of the composite character of 
the scales with respect to their logical structure, there is also a 
natural tendency to refer to the entire domain as measurement 
theory. 

From the previous discussion of levels of measurement, it should 
now be apparent that there are two broad aspects to measurement. 
On the one hand there is an abstract or formal system of elements 
with certain properties and operations. Each such system is a mathe- 
matical system, a calculus, in the sense in which Carnap uses it (4). 
The successive levels of measurement correspond to successive sys- 
tems in which there are, axiomatically, more properties and oper- 
ations. In these cases the result is a successively stronger system in 
the sense of the proliferation of the mathematical structure that can 
be built on the given axiomatic basis. 

* This, of course, is not experimentally demonstrable by any existing 
technique and could be accomplished only by assumption. 

The second broad aspect of measurement consists of the domains 
of "objects" in the real world and their observable properties 
and relations. Measurement may then be regarded as the process of 
mapping a real object system into one of these abstract systems, the 
purpose of the mapping being, in part, to substitute operations in 
the abstract system for operations on the object system. However, 
in order that the operations in the abstract system have meaning in 
terms of the real object system, it is necessary that the axioms on 
which the abstract system has been built be satisfied by the object 
system. 

To say that measurement may be regarded as mapping an 
object system into an abstract system is both more and less than what 
is sometimes meant by measurement. From a limited to a general 
sense measurement may mean: 

(1) mapping an object system into an abstract system 
permitting the assignment of numbers to the objects, and 
permitting at least some of the operations of arithmetic to be 
performed on these numbers. 

(2) mapping an object system into at least a simple order, 
and including the ordered metric, interval scale, and ratio 
scale. 

(3) a generalization to the extent of mapping an object system 
into a partial order or even a nominal scale. 

(4) a generalization including the decomposition of partial 
orders into sets of simple orders, in effect, measurement in 
a multidimensional space. 

Under this last generalization, measurement theory includes the 
entire subject of analysis of data. This is as it should be, for data 
analysis is an integral part of the logic of measurement theory. 


The Dilemma of the Social Scientist 


It is now evident that the process of measurement consists, in 
part, of selecting a level of measurement, an abstract system into 
which the data are to be mapped. A variety of such systems is 
available, varying in level from weak to strong. The axiomatic basis of 
the level of measurement selected constitutes a theory about the 
behavior in question in that the axioms specify relationships that 
are to hold in the data and the properties of the relationships. These 
axioms then become part of the data in the sense that they take 
precedence over the data. If there should exist any data 
incompatible with the axioms of the level of measurement selected, 
these data constitute error, by definition. 

Let us illustrate this by a hypothetical example of rating 
leadership ability. Suppose we have a group of individuals whose 
leadership abilities it is desired to determine, and suppose a group 
of judges who know each of the individuals are asked to rank order 
them in leadership ability. We shall require that leadership ability 
be measured on an ordinal scale. This requires that if A has more 
leadership ability than B, and B has more than C, A must have 
more than C, for all triplets, A, B, C. Inasmuch as we required each 
judge to rank order the individuals, our method of collecting data 
has imposed a simple order on leadership ability as far as each judge 
is concerned. But we find that the judges do not all agree on the 
rank order of the individuals in leadership ability. Consequently 
we must make some further assumptions, usually pertaining to 
randomness of errors, a constant origin, and a constant unit of 
measurement. Such assumptions as these will make possible the combination 
of a number of different rank orders into the single rank order 
which is required. One may then return to the data and specify 
in detail the "errors" which each judge made. It would perhaps 
be interesting to face the individual judges with their alleged 
“errors” and study their reactions. 
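
How "error" follows from the imposed level of measurement can be shown with a small sketch. The three judges' rankings are hypothetical, and averaging the ranks is only one assumed rule for combining them (the text does not prescribe a particular rule); once a single rank order is required, every departure from it is classified as error by definition.

```python
# Hypothetical rank orders (1 = most leadership ability) from three judges.
rankings = {
    "judge 1": {"A": 1, "B": 2, "C": 3, "D": 4},
    "judge 2": {"A": 1, "B": 3, "C": 2, "D": 4},
    "judge 3": {"A": 2, "B": 1, "C": 3, "D": 4},
}
individuals = ["A", "B", "C", "D"]

# Assumed combination rule: average the ranks and order by the mean.
mean_rank = {
    p: sum(r[p] for r in rankings.values()) / len(rankings) for p in individuals
}
consensus = sorted(individuals, key=mean_rank.get)
consensus_rank = {p: i + 1 for i, p in enumerate(consensus)}

# Every departure of a judge from the single required order is "error".
errors = {
    judge: {
        p: r[p] - consensus_rank[p]
        for p in individuals
        if r[p] != consensus_rank[p]
    }
    for judge, r in rankings.items()
}
```

Here the combined order is A, B, C, D, so the first judge is credited with no errors while the second and third are each charged with two. A different combination rule, or a weaker level of measurement, would classify different judgments as error.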

The problem implicit in the situation can be called the dilemma 
of the social scientist, which in its simplest form is the problem 
of what shall be called error. Almost anyone is willing to say that 
any given set of data contains some error, but just what is to be 
classified as error depends a good deal on the level of measurement 
assumed to hold in the data. 

The social scientist is faced by his dilemma when he chooses 
between mapping his data into a simple order and asking his data 
whether they satisfy a simple order. By selecting a strong enough 
system, the social scientist can always succeed in constructing a 
unidimensional scale of measurement, commonly an interval scale, 
thus requiring a portion of the data to be classified as error. By 
not requiring a strong system, the social scientist permits the data 
to determine whether a simple unidimensional solution is adequate. 
Unidimensionality, obtained by a method of analysis which guarantees 
it, obviously cannot thereby be shown to be a characteristic of 
the behavior in question. This is merely a special case of a more 
general principle that no property of data can be said to hold unless 
the methods of collecting and of analyzing the data permit alterna- 
tive properties to exhibit themselves. The problem of the social 
scientist, in blunt terms, is whether he knows what he wants or 
whether he wants to know. 

There are several reasons why social scientists so frequently 
choose a strong level of measurement rather than a weak one to 
represent the data. It is always more profitable to use a strong scale 
in preference to a weak one when both satisfy the data, because 
a more powerful mathematics is available for use in the description 
and analysis of the data. Compare, for example, the rank-order cor- 
relation methods appropriate to an ordinal scale with the powerful 
systems of linear, multiple, and nonlinear correlation methods which 
require an interval scale. There is no wonder that measurement 
by an interval scale has been a major objective. 

There is another, and in some ways a curious but valid, reason 
for the social scientist to choose a stronger level of measurement 
than is satisfied by the data: society often requires that at least a 
simple order be imposed on an attribute. Thus, in the case of the 
esthetic merit of paintings, an individual may be faced with the 
problem of choosing a painting to buy. In order to make a decision, 
he must, in spite of the existence of only a partial order, impose a 
simple order over at least a segment of the space. This conflict 
between what appears to be the inherent nature of social-psycholog- 
ical attributes as revealed by the data and the common insistence by 
society that at least a simple order be imposed lies at the root of the 
problem which the social scientist faces. This is the primary explana- 
tion of why the social scientist must so frequently be "unscientific" 
and, in effect, be forced to treat his measurement theory or scale 
as "right" in spite of his data. 

This situation, the common social necessity of mapping a partial 
order into a simple order or stronger system, has no unique solution 
except by fiat. There may be a number of different "best” solutions 
for different purposes, but there appears to be no single universally 
"right" solution. This enforced mapping of a partial order into a 
stronger system may be one of the sources of social conflict. 

The major implication to be drawn from the discussion in this 
section is that it behooves the social scientist to become fully aware 
of and acquainted with the subject of measurement theory. There 
are no pat answers or conventional rules which can be applied 
routinely and in the absence of understanding. The "conclusions" 
that are drawn from an analysis of data are highly dependent on 
the level of measurement assumed and imposed on the data; 
"conclusions" may be easily confused with postulates or assumptions that 
have been built into the data. A decision must be made in terms 
of the objectives of the specific study as well as on the basis of the 
logic of measurement theory. 

This section has discussed the general nature of measurement 
theory, the various levels of measurement, and what is involved in 
the process of measuring. These considerations, combined with the 
theory of data presented in the next section, will provide a framework
for understanding and relating the various methods of collecting
and analyzing data discussed in the last two sections.


A THEORY OF DATA 

There is a great variety of techniques and procedures used 
by social scientists for collecting and analyzing data. All too frequently,
studies which should have mutual implications are not
comparable because their methodologies are different. The pur- 
poses of the theory of data presented in this section are to provide 
a simple framework for organizing and classifying the methods of 
collecting and analyzing data and to provide a mathematical basis 
for relating methodologies within classes and between classes. 


Genotypic and Phenotypic Levels of Description 

In the theory of data presented here, two levels of description 
will be recognized: a genotypic level and a phenotypic level. The
phenotypic level refers to the observed or manifest behavior; the 
genotypic level to an inferred, hypothetical, latent level of behavior 
underlying or generating the phenotypic level. 

These two levels of description can be illustrated by the performance
of an individual on a mental test. On an arithmetic test,
for example, his behavior on each item (usually categorized as right
or wrong) constitutes the manifest behavior, the phenotypic level.
On this level, performance is commonly represented by a pattern
of responses (passes and failures) or by a numerical score based on
the number of items right. From the manifest behavior of this
individual and of a number of other individuals, inferences of a
genotypic nature are usually made; for example, that individual A has
more of the ability than individual B. The manifest behavior is
implicitly regarded as a function of the individual’s genotypic abil- 
ity and certain characteristics of the stimulus situation. On another 
test, also measuring arithmetic ability, the individual may get a 
quite different score, but a set of such scores from a number of in- 
dividuals are conventionally expected to have at least a monotonic, 
if not linear, relationship to the scores of these same individuals on 
the first test. 

In a different context, suppose that the more aggressive of two 
individuals always dominated. A given individual with a particular 
amount of aggressiveness (genotypic level) might behave submis- 
sively (phenotypic level) in the presence of one individual and be 
dominant in the presence of another. Some of the assumptions 
involved in such a line of reasoning have been formalized and made 
a part of a basis for a general theory of psychological scaling (8). 
The theory of data presented here, and all of the relationships 
between methodologies contained in the next two sections, are a 
consequence of the general theory. In this chapter the more technical 
treatment will be avoided and the presentation will be made on a 
verbal level. 


The Information Contained in Observations 

There are two fundamental aspects to this theory of data— a 
definition of the information contained in an observation on the 
phenotypic level, and a definition of the relationship between the 
genotypic level and the phenotypic level. These two aspects provide 
the basis for making genotypic inferences from the observations. 

Let us consider what the information is that is contained in
an individual’s performance on an arithmetic item. Genotypically,
the stimulus,⁶ an arithmetic problem in this case, is regarded as
requiring a certain minimum amount of some ability for it to be
successfully solved or passed (the ability is spoken of in the singular
but it may be constituted of more primitive abilities). The individual
is regarded as possessing a certain amount of this ability. His manifest
behavior consists in either passing or failing the problem. The
assumption is made that this implies that the individual has more
or less, respectively, of the ability required.⁷ The information in
each element of behavior, a response to an item in the test, is 
whether the individual has more or less of some ability than the 
amount of the ability required by that item. 

This is the kind and the amount of information which will be 
assumed to be contained in the behavior of individuals in responding 
to this type of item. Exactly what this type of item is will be discussed 
later when these ideas are generalized. 

There are other ways of collecting or observing behavior and 
other types of items. These procedures may differ both in the kind 
and in the degree of information they contain. Consider, for example,
the manifest behavior of an individual in stating whether or
not he will endorse certain opinions. A statement of opinion may
be regarded as representing a particular degree of attitude and
possibly different degrees on different attitudes. On each of such
attitudes, the individual has a specific degree of attitude himself
at any moment of time. The degree of attitude held by an individual
will be called his "ideal" for that particular attitude. Whether or
not an individual endorses a given item is then determined by 
whether he regards the item as being “sufficiently close” to his ideal. 
If the individual is asked to choose the three items he most prefers 
to endorse, it will be assumed that he chooses the three items near- 
est his ideal. 


It is apparent that the information contained in such data is
distinct from the kind of information contained in the example
of the arithmetic test. In the arithmetic test, the individual
passed all items requiring less ability than he possessed; in the
case of the attitude items, he "endorsed" those nearest to him. To
generalize these ideas somewhat, the individual’s ability may also
be regarded as an "ideal": he passed all the arithmetic items on one
side of his ideal and failed all those on the other side. In the case of
the attitude items, he endorsed items near his ideal regardless of
direction and rejected items further away.

⁶ Whether behavior arises in response to other individuals, group
situations, internal physiological states, or whatever, each of these
may be regarded as, and for simplicity will be called, a stimulus.

⁷ The items being discussed do not permit chance success by guessing;
in other words, they are not objective-type items.

In these two instances, the manifest behavior is abstractly the 
same. The stimuli are classified for each individual in two piles:
those he passed or endorsed, and those he failed or rejected. The
assumptions by means of which genotypic inferences are made, how- 
ever, are distinct in these two examples. 
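
The contrast between the two sets of assumptions can be sketched in a few lines of code. This is only an illustration under stated assumptions: the scale values, the individual's position, and the `tolerance` that stands in for "sufficiently close" are hypothetical and do not come from the text.

```python
def passes(ability: float, required: float) -> bool:
    """Arithmetic-type item: pass when the individual has more of the
    ability than the item requires."""
    return ability > required


def endorses(ideal: float, item: float, tolerance: float) -> bool:
    """Opinion-type item: endorse when the item lies within `tolerance`
    of the ideal, regardless of direction (hypothetical closeness rule)."""
    return abs(item - ideal) <= tolerance


# Hypothetical genotypic scale values for five items and one individual.
items = [1.0, 2.0, 3.0, 4.0, 5.0]
person = 3.4

passed = [passes(person, q) for q in items]
endorsed = [endorses(person, q, tolerance=1.0) for q in items]

print(passed)    # [True, True, True, False, False]
print(endorsed)  # [False, False, True, True, False]
```

In both cases the manifest record is a two-pile classification of the same five items; only the rule connecting it to the genotypic level differs.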

With some methods of collecting data, the information in the 
manifest behavior differs in degree from the above. For example, 
the individual may be asked to rank the statements of opinion in
the order in which he prefers to endorse them. On the basis of the
assumptions already made, it follows that the stimuli are ranked
in order of their increasing distance from his ideal on the respective
attributes.

It is obvious that the information in these data differs in kind 
from the information in data based on which items the individual 
is actually willing to endorse. In the rank-order data it is not known
whether the individual would be willing to endorse any of the items 
or all of them. It is also apparent that the information in the rank- 
order data differs only in degree from that contained in data based 
on which three items the individual would prefer to endorse. The 
three items the individual would choose are, on the basis of the 
postulates, the first three items in the rank order. 
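
A minimal sketch of this relation, assuming hypothetical scale values and a hypothetical ideal on a single attribute: the rank order is the list of stimuli sorted by distance from the ideal, and the three preferred items are its first three entries.

```python
def rank_order(ideal, stimuli):
    """Return stimulus labels in order of increasing distance from the ideal."""
    return sorted(stimuli, key=lambda name: abs(stimuli[name] - ideal))


# Hypothetical positions of five statements of opinion on one attribute.
stimuli = {"A": 1.0, "B": 2.0, "C": 4.0, "D": 7.0, "E": 11.0}
ideal = 3.2  # hypothetical ideal of one individual

order = rank_order(ideal, stimuli)
pick3 = set(order[:3])  # the three items he would most prefer to endorse

print(order)          # ['C', 'B', 'A', 'D', 'E']
print(sorted(pick3))  # ['A', 'B', 'C']
```

The set of three discards the order among its members, which is the sense in which it differs from the rank order only in degree.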

Task A and Task B 

The kinds of behavior which have been discussed up to this
point have one thing in common from the point of view of the theory
of data. They all involve the evaluation of stimuli with respect to
an ideal. In each instance both the stimuli and the individuals
have hypothetical genotypic measures and the manifest behavior
permits inferences to be made about the relationship of the genotypic
magnitudes of the stimuli to those of the individuals. In the later
sections of this chapter, particularly in the last section, we shall see
what conditions are necessary to make inferences about the genotypic
relationships between individuals and the genotypic relationships
between stimuli.

For didactic purposes, data involving the evaluation of stimuli 
with respect to an ideal will be referred to as data collected by task 
A. All the behavior discussed up to this point, then, is task A
behavior. It should perhaps be pointed out that it makes no difference
whether an attribute is explicit or implicit in the instructions to
the subject. Thus, it is still task A whether an individual is asked 
which candidates he prefers with respect to their attitudes toward 
foreign affairs or whether he is merely asked which candidates he 
prefers. This does have a bearing on the interpretations given to 
the genotypic inferences, but in the formal analysis of data it is 
irrelevant. Similarly, it makes no difference in the abstract system 
whether the individual’s ideal is his own, someone else's, or one 
given to him by the experimenter. The abstract characteristic of 
task A is that a stimulus is evaluated with respect to direction or
distance⁸ from a point in space called an ideal.

There is another kind of behavior which can be observed and
with respect to which data can be collected. This type of behavior
is evaluation of stimuli with respect to an attribute and will be
called task B. Illustrations are contained in the evaluations of
statements of opinion as to which candidate expresses a more liberal
attitude or which candidate is more pro-union, or in rating the
aggressiveness of an individual. Whether or not the judge has an
ideal of his own on liberalism, etc., is regarded as irrelevant.

Data collected under task B may also differ both in kind and
degree. When an individual ranks a number of individuals as to
their administrative capacities, there is no information in these
data as to whether or not any of the individuals are good administrators,
as would be implied in the case of rating them. This is an
example of a difference in kind of information. An example of a
difference in degree of information is picking the three best administrators
in a group of individuals; this contains less information
than the rank ordering.


⁸ In a multidimensional space, the definition of distance may be made
in alternative ways. This problem is too complex and has not been
sufficiently worked out for discussion in this chapter. In the
unidimensional case, these alternative definitions of distance do not arise.

Relative and Irrelative Behavior

In this discussion of the theory of data, an explicit dichotomy 
has been made between task A and task B with respect to the nature 
of the information contained in the data. Certain other dichotomies 
have been made implicitly. One of these may be referred to as relative 
and irrelative behavior. In relative behavior, the data are based on 
the relationships between two or more stimuli; for example, a judgment
as to which of two candidates an individual would prefer
(task A), or which of the candidates is more pro-union (task B). If
the individual rank ordered his preference for all the candidates,
it would still be relative behavior, task A, and the information in
the data would differ only in degree from the information in a judgment
on one pair. Obviously this can be extended to task B similarly.

In irrelative behavior, the individual’s judgments involve a
single stimulus at a time. This is illustrated by the arithmetic test
referred to, by rating individuals on a rating scale, or by expressing
one's likes and dislikes for each of a number of items in an interest
inventory.

Fig. 1. A classification of methods of collecting and analyzing data.
(Fourfold table; column headings: Irrelative, Relative.)

The two dichotomies developed so far, tasks A and B, and relative
and irrelative behavior, may be put together in the form of a
fourfold table (Fig. 1) in which, for simplicity of discussion, the
quadrants have been numbered. It will be observed that Quadrant
II has been further dichotomized on the basis of the kind of stimuli,
monotone or nonmonotone. This distinction is illustrated by the
different assumptions involved in making genotypic inferences in
the passing or failing of an arithmetic item and in the case of
whether or not a statement of opinion is endorsed. In the arithmetic
items, the individual’s manifest behavior is "pass" for all items on
one side of him and "fail" for all items on the other side. This type
of item is called monotone. Nonmonotone items are exemplified by
statements of opinion in which the individual endorses those items
nearest him in a segment of the space surrounding him and rejects
all items beyond.

In the case of a single underlying latent attribute, the items 
an individual rejects may consist of two subsets, one of which con- 
tains those that are too extreme in one direction and the other those 
that are too extreme in the other direction. Consider, as an example
of a nonmonotone item, the following statement of opinion: "We
should make the loan to Britain if we are sure they will pay it 
back." On a hypothetical continuum from pro- to anti-British, 
individuals in the neighborhood of the middle of this continuum 
would presumably endorse this statement. But the individuals who 
refuse to endorse this statement may be at opposite poles of the 
continuum. The very pro-British may say "no" because they want 
to make a loan to Britain without any conditions and the very anti- 
British say "no" because they do not want to make a loan to 
Britain under any conditions. Hence, two very different kinds of 
people genotypically behave identically (phenotypically) in this 
situation. Similarly, those individuals in the middle of the con- 
tinuum who endorsed this statement might be expected not to 
endorse extreme pro- or anti-British statements. Hence, certain individuals
will behave phenotypically the same to two genotypically
distinctly different kinds of stimuli. 

In the abstract generalization of these two types of items, monotone⁹
items are those for which a one-to-one mapping of the categories
of manifest behavior into genotypic categories is possible,
and nonmonotone items are those for which a one-to-many mapping
is necessary.

⁹ A type of item called "cumulative" is a special case of a monotone item.

Distinction Between Methods of Collecting and Analyzing Data

We now have a framework within which methods of collecting 
and analyzing data may be discussed. Within a given quadrant 
(Fig. 1), the kind of information contained in the data is the same
and data collected by various methods differ primarily in the degree
of information they contain. The placement of a set of data in this
fourfold table will be seen to be, within certain limits, a decision
that is made by the experimenter when he chooses a particular technique
for analyzing the data.

The method of collecting data determines what information
they contain, but the method of analysis defines this information,
and this is what situates the data in the table. Methods of analyzing
data have been devised historically, ad hoc, for each of the quadrants.
Some of these methods seek only to consolidate or average the
phenotypic information; others seek to make genotypic inferences
from the phenotypic information.

The method of analysis selected may permit [...]
the properties of the information or may also define the properties.
In the latter case, the experimenter is concerned only with the [...]
relations. This is precisely the locus of the dilemma of the social
scientist referred to in the previous section.

The distinction between methods of collecting data and methods
of analyzing data is imperative for an understanding of the
relationship between the inferences drawn from [...].
The relationships among the quadrants are basic to an understanding
of the distinction between collecting and analyzing data and
also to the relationships among different methods of collecting data
and different methods of analyzing data.

Irrelative behavior (represented by Quadrants II and III) is,
in the abstract, the response of an individual to a single stimulus, as
contrasted with relative behavior (Quadrants I and IV), in which
the response of an individual is a choice between stimuli. Clearly,
irrelative behavior is represented by the Method of Single Stimuli
in its broadest sense, and all the methods in
Quadrants I and IV, relative behavior, can be referred to collectively
as the Method of Choice.

In Quadrant III a distinction is possible between monotone
and nonmonotone stimuli, but since methods of collecting data in
this quadrant invariably use monotone stimuli, nonmonotone
stimuli have been neglected in this presentation. It can be shown
that Quadrant III is formally indistinguishable from Quadrant
IIa; the distinction exists only in the frame of mind of the experimenter.
Furthermore, it can be shown that the Method of Single
Stimuli as a whole, both Quadrants II and III, is a special case of 
Quadrant I. 

The distinction between Quadrants I and IV is a real one,
although there are data collected by certain methods which may be
placed in either quadrant depending upon the objectives and frame
of mind of the experimenter. These distinctions and interrelations
will be brought out in more detail in the next two sections.


METHODS OF COLLECTING DATA 

In this section a number of methods of collecting data will be 
discussed and some of their interrelations pointed out. A potential 
source of confusion resides in the fact that the names for some 
methods imply both a method of collecting and a method of analyzing
data, e.g., the Method of Successive Intervals (28). Throughout
this section the mention of any method will have reference only to 
its use as a method of collecting data. 

A general system for structuring or organizing methods of
collecting data in Quadrant I will be developed on the basis of the 
amount of information each contains. The relation of these methods 
when used to observe behavior in Quadrant IV will be pointed out. 
The sense in which the Method of Single Stimuli of Quadrants II 
and III is a special case of Quadrant I will then be discussed. 

Quadrant I 

To illustrate the information contained in methods of collecting 
data in Quadrant I, examples will be given in which unidimensionality
has been imposed. Such examples will be used partly because
the unidimensional case has been more completely worked out and
also because of the preoccupation of most social scientists with
unidimensional representations of data.

Let us suppose, throughout the following discussion, that we
have five stimuli, A, B, C, D, and E, on some latent attribute and
that the ideals of individuals are distributed over this latent
attribute with fixed metric relations. For example, suppose
that the five stimuli are statements of opinion representing different
degrees of attitude toward [...] and the ideals of individuals are
the hypothetical statements of opinion which each would
endorse above all others. This situation [...]
are unrealistic constraints, but the simplicity of the conditions is
very desirable for didactic purposes. These constraints are completely
relaxed in the more [...]. [...] Figure 2, in which a [...]
continuum is indicated. Such a scale may be called a J
scale or joint distribution.

Fig. 2. An example of a distribution of people and stimuli on a
joint scale.

Suppose that one collects data by asking each individual to
indicate the two statements he most prefers to endorse. On the basis
of the postulate that an individual will prefer to endorse a stimulus
nearer his ideal than one farther away, the response patterns under
these conditions would be the following:

AB, BC, CD, DE.

All individuals to the left of the midpoint
between stimuli A and C would give the response pattern AB.
Individuals to the right of the midpoint AC and to the left of the
midpoint BD would have the response pattern BC. The relationship of all
responses to midpoints is illustrated in Figure 3, in which the
boundaries of the regions are indicated by
vertical lines sectioning the scale. The response associated with a
region or segment of the scale is also indicated.

Fig. 3. Relation of response patterns and scale values under
"Pick 2."


This method of collecting data, called "Pick 2," in general yields
n-1 categories of individuals, where n is the number of stimuli, and 
the boundaries of regions are given by the midpoints of alternate 
stimuli. 
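
The way the midpoints section the scale can be checked with a short sketch. It rests on assumptions not found in the text: the scale values below are hypothetical (unequally spaced so that no two midpoints coincide), and the helper names `pick`, `probe_points`, and `patterns` are introduced only for illustration. One hypothetical ideal is placed inside every region cut out by the midpoints, and the distinct responses are collected in scale order.

```python
from itertools import combinations

# Hypothetical scale values for the five stimuli.
STIMULI = {"A": 1.0, "B": 2.0, "C": 4.0, "D": 7.0, "E": 11.0}


def pick(ideal, k, stimuli=STIMULI):
    """Return the k stimuli nearest the ideal, written alphabetically
    (which is also scale order for these labels)."""
    nearest = sorted(stimuli, key=lambda s: abs(stimuli[s] - ideal))[:k]
    return "".join(sorted(nearest))


def probe_points(stimuli=STIMULI):
    """One ideal strictly inside every region bounded by pair midpoints."""
    cuts = sorted({(stimuli[a] + stimuli[b]) / 2
                   for a, b in combinations(stimuli, 2)})
    inner = [(lo + hi) / 2 for lo, hi in zip(cuts, cuts[1:])]
    return [cuts[0] - 1.0] + inner + [cuts[-1] + 1.0]


def patterns(k):
    """Distinct "Pick k" response patterns, from left to right on the scale."""
    seen = []
    for ideal in probe_points():
        response = pick(ideal, k)
        if response not in seen:
            seen.append(response)
    return seen


print(patterns(2))  # ['AB', 'BC', 'CD', 'DE']
```

With n = 5 the sketch returns n-1 = 4 patterns, and the changes of pattern fall at the midpoints AC, BD, and CE, that is, at midpoints of alternate stimuli.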

If, instead of "Pick 2," the individuals had been instructed
to "Pick 3," the response patterns would have been 

ABC, BCD, CDE, 

and their relation to the scale would be as illustrated in Figure 4. 


Fig. 4. Relation of response patterns and scale values under
"Pick 3." (Midpoints AD and BE section the scale into regions with
response patterns ABC, BCD, and CDE.)

The number of categories of individuals would have been three or,
in general, n-2. This procedure, "Pick k," may be continued for
higher values of k up to n-1. For "Pick n-1" only two patterns of
response would be obtained. One pattern would be that of individuals
to the left of the midpoint between the first and last stimulus;
the other pattern would be that of individuals to the right of
this midpoint. This might be more immediately obvious if one
recognizes that from a formal point of view, "Pick n-1" is the same
as "Reject 1." In the unidimensional case, there are only two stimuli
which may be rejected, the two end ones, and the choice between
them is determined by their midpoint in relation to the judge's
ideal. In the present example the results would be as indicated in
Figure 5.

Fig. 5. Relation of response patterns and scale values under
"Pick n-1" or "Reject 1."


For each of these methods of collecting data in Quadrant I,
from "Pick 2" to "Pick n-1," there corresponds a method of analysis
which will reveal certain characteristics of the genotypic structure
underlying the phenotypic behavior, the stated preferences. This
method of analysis, called Parallelogram Analysis, is discussed in the
next section.

If this method of collecting data, "Pick k," were extended to
k = 1, where each individual indicates the one item he will endorse,
there is no unique solution to the genotypic structure underlying the
preferences (see p. 513). "Pick 1" is the special case of the Method
of Choice which corresponds to the Method of Single Stimuli. This
is not so severe a criticism as it may at first appear, because the
Method of Single Stimuli is ordinarily used only where there is an
a priori ordering of the alternatives to an item and the data are not
asked to provide it. Methods of analyzing such data then become
concerned only with the interrelationships between a number of
such scales as the above, one for each item. Consequently, "Pick 1"
will be studied as a special case in its own right, under Quadrant I.
Omitting "Pick 1," then, for the time being, let us return to "Pick
2" and continue with relative behavior.

If our subjects are instructed to "Order 2" (i.e., indicate their
first and second choices) instead of "Pick 2," the response pattern
AB obtained in "Pick 2" becomes two response patterns, AB and BA,
and these are given by individuals to the left and right, respectively,
of the midpoint AB. Similarly, each of the "Pick 2" patterns becomes
two patterns under "Order 2" instructions, and the midpoints
of adjacent stimuli on the genotypic scale have been added to the
midpoints of alternate stimuli to form the boundaries of the regions
associated with each phenotypic response pattern. This is illustrated
in Figure 6. The number of categories of individuals has become
eight instead of four, or, in general, 2(n-1).

Fig. 6. Relationship of response patterns and scale values under
"Order 2."


It is now apparent that for n stimuli presented at a time the
methods of collecting data from "Pick n-1" to "Order 2" (omitting
"Pick 1") contain a monotonically increasing amount of information
in the sense of the number of categories the individuals are classified
into on the basis of distinguishable response patterns. The method
of analysis of information (the Parallelogram Technique) leading to
genotypic inferences is the same for all and will be discussed in the
next section.
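
The increase in the number of categories can be tabulated with a brief sketch. The scale values are hypothetical, and the names `ranking`, `count_categories`, `pick`, and `order` are illustrative helpers rather than anything defined in the chapter. Each method is treated as a way of summarizing an individual's full ranking, and the distinct summaries are counted over one hypothetical ideal per region of the scale.

```python
from itertools import combinations

# Hypothetical scale values; spacing chosen so that no two midpoints coincide.
STIMULI = {"A": 1.0, "B": 2.0, "C": 4.0, "D": 7.0, "E": 11.0}


def ranking(ideal):
    """Stimuli in order of increasing distance from the ideal."""
    return sorted(STIMULI, key=lambda s: abs(STIMULI[s] - ideal))


def probe_points():
    """One ideal strictly inside every region bounded by pair midpoints."""
    cuts = sorted({(STIMULI[a] + STIMULI[b]) / 2
                   for a, b in combinations(STIMULI, 2)})
    inner = [(lo + hi) / 2 for lo, hi in zip(cuts, cuts[1:])]
    return [cuts[0] - 1.0] + inner + [cuts[-1] + 1.0]


def pick(k):
    """"Pick k": the k nearest stimuli, order among them discarded."""
    return lambda ranked: "".join(sorted(ranked[:k]))


def order(k):
    """"Order k": the k nearest stimuli, order of preference retained."""
    return lambda ranked: "".join(ranked[:k])


def count_categories(summarize):
    """Number of distinguishable response patterns under a method."""
    return len({summarize(ranking(x)) for x in probe_points()})


counts = {
    "Pick 4": count_categories(pick(4)),
    "Pick 3": count_categories(pick(3)),
    "Pick 2": count_categories(pick(2)),
    "Order 2": count_categories(order(2)),
}
print(counts)  # {'Pick 4': 2, 'Pick 3': 3, 'Pick 2': 4, 'Order 2': 8}
```

For these five stimuli the counts rise from two to eight, in agreement with the figures given in the text.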

It is obvious that this series of methods of collecting data may
be extended to "Order 3," "Order 4," down to "Order n-1." The
latter corresponds to the Method of Rank Order. These methods
of collecting data may be distinguished from the preceding methods,
for although both types are logically contiguous, methods where
a number of items are ordered yield a new class of information:
information about metric relations.

A detailed presentation of the theory and technique for extracting
information about metric relations in rank order data is contained
in the literature (6). Here the case of "Order 3" will be
illustrated. Keeping in mind Figure 2 and the psychological postulates
which have been made, consider what the data would be like
for "Order 3." All individuals to the left of the midpoint AB would
yield ABC as their phenotypic behavior. Individuals to the right
of this midpoint AB but to the left of the midpoint AC would act
alike phenotypically and would yield BAC. Crossing the midpoint
AC reverses the order of these two stimuli in the phenotypic behavior
and the next ordering would be BCA. If this process is continued
for the situation given in Figure 2, the complete results are
those given in Figure 7.

Fig. 7. Relation of response patterns and scale values under
"Order 3."

In comparison with "Order 2," where there were eight classes
of individuals or, in general, 2(n-1), here there are ten or, in general,
3n-5. But there is in addition a new kind of information in these
data: information about metric relations. Consider the two midpoints
BC and AD. In this instance the midpoint BC precedes AD
because the interval between the two stimuli A and B, AB, is less
than the interval between the two stimuli C and D, CD. The phenotypic
behavior associated with the interval bounded by the two
midpoints BC and AD is CBA, and this phenotypic behavior carries
the genotypic implication that AB < CD. If, for example, AB > CD
and hence the AD midpoint had preceded the BC midpoint, the
phenotypic behavior in the region they bounded would have been
BCD instead of CBA.

Similarly, the phenotypic behavior CDE implies that BE precedes
CD, which has the genotypic implication that BC > DE. In
general, metric information from "Order 3" is obtained with respect
to the relative magnitude of alternate single intervals between
stimuli.
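
The dependence of the "Order 3" patterns on the relative size of the intervals AB and CD can be illustrated with two hypothetical configurations. Both sets of scale values, and the helper name `order3_patterns`, are assumptions made only for this sketch; the two configurations have the same stimulus order and differ only in which of the two intervals is larger.

```python
from itertools import combinations


def order3_patterns(stimuli):
    """Distinct first-three orderings as the ideal moves left to right."""
    cuts = sorted({(stimuli[a] + stimuli[b]) / 2
                   for a, b in combinations(stimuli, 2)})
    probes = ([cuts[0] - 1.0]
              + [(lo + hi) / 2 for lo, hi in zip(cuts, cuts[1:])]
              + [cuts[-1] + 1.0])
    seen = []
    for ideal in probes:
        ranked = sorted(stimuli, key=lambda s: abs(stimuli[s] - ideal))
        first_three = "".join(ranked[:3])
        if first_three not in seen:
            seen.append(first_three)
    return seen


# Hypothetical configuration with AB = 1 smaller than CD = 3.
ab_smaller = {"A": 1.0, "B": 2.0, "C": 4.0, "D": 7.0, "E": 11.0}
# Hypothetical configuration with AB = 3 larger than CD = 1.
ab_larger = {"A": 1.0, "B": 4.0, "C": 6.0, "D": 7.0, "E": 11.0}

print(order3_patterns(ab_smaller))
print(order3_patterns(ab_larger))
```

Each configuration yields ten patterns, but CBA occurs only when AB is the smaller interval and BCD only when it is the larger, so the presence of one or the other in a set of data carries the metric information described above.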

If "Order k" is employed as a method of collecting data, the
information contained in the data includes that for any lower value
of k and additional metric information as k increases. For k = 4,
the information increases to include the relative magnitudes of sums
of adjacent intervals compared with sums of adjacent intervals. The
method containing the most metric information is "Order n-1," the
Method of Rank Order.

For the sequence of methods of collecting data involving "Order
k," 3 ≤ k ≤ n-1, there is a corresponding method of analysis for
obtaining the genotypic inferences contained in the data. This
method of analysis is called the Unfolding Technique. The rank
order of the stimuli for an individual will be called a simply ordered
I scale and may be regarded as the J scale folded at the individual's
ideal. It is this which gives the name the Unfolding Technique to
the analysis of sets of I scales (phenotypic behavior) to generate
a J scale (a genotypic inference). The Method of Parallelogram Analysis
will be seen to be a special case of the Unfolding Technique.
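
The folding image can be made concrete with a small sketch of the forward direction only, from a known J scale to the I scales it generates; it is not an implementation of the Unfolding Technique, which works in the opposite direction. The scale values and the ideals are hypothetical, and `fold` is an illustrative name.

```python
def fold(j_scale, ideal):
    """Fold the J scale at `ideal`: the stimuli ordered by increasing
    distance from the ideal form that individual's I scale."""
    return "".join(sorted(j_scale, key=lambda s: abs(j_scale[s] - ideal)))


# Hypothetical J scale: positions of five stimuli on one continuum.
j_scale = {"A": 1.0, "B": 2.0, "C": 4.0, "D": 7.0, "E": 11.0}

print(fold(j_scale, 0.0))   # ABCDE
print(fold(j_scale, 3.5))   # CBADE
print(fold(j_scale, 12.0))  # EDCBA
```

An individual at either extreme reproduces the J scale order itself (forward or reversed), while an individual in the interior interleaves the stimuli on his two sides.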

The next two methods of collecting data to be considered are
the Method of Paired Comparisons and the Method of Triads. The
Method of Paired Comparisons for Quadrant I behavior constitutes
the presentation of all possible pairs of stimuli with the instruction
to the individual to indicate a preference within each pair. The
Method of Triads for Quadrant I behavior constitutes the presentation
of all possible triplets of stimuli with the instruction to indicate
which is most preferred and which is least preferred. A further
generalization is clearly possible but would be beyond the scope
of this chapter.

In the Method of Triads, each pair of stimuli is judged n-2
times, where n is the number of stimuli. Each pair of stimuli is a
constituent of a triad with each of the remaining stimuli in turn,
and when the individual says of the stimuli A, B, and C that he
prefers A most and C least, he is placing them in a rank order of
preference A, B, C. This rank order of preference is equivalent to
three transitive paired comparisons A > B, B > C, A > C, where
">" means preferred. It is important to note that this transitivity
has been imposed by the method used in collecting the data. The
judgments of an individual on a triad may be decomposed into
three paired comparisons which have two degrees of freedom. One
has no way of telling, of course, which two paired comparisons are 
the independent ones. 

Because each paired comparison judgment between a given pair 
of stimuli is made n-2 times, it is apparent that the Method of Triads 
permits the consistency of a paired comparison judgment to be 
tested in the context of a third stimulus. The Method of Paired 
Comparisons, in which each paired comparison judgment is made 
only once, does not permit testing the consistency of a judgment but, 
assuming consistency, this method permits testing the transitivity 
of the paired comparison judgments. It is now apparent that the 
Method of Rank Order imposes both consistency and transitivity
on the implied paired comparison judgments. 
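
The decomposition of triad judgments into paired comparisons, and the consistency check it makes possible, can be sketched as follows. The judge is simulated and error-free, ordering each triad by distance from a hypothetical ideal on hypothetical scale values, so consistency holds here by construction; with real judgments the same tally would show where a pair was judged in both directions.

```python
from collections import Counter
from itertools import combinations

STIMULI = {"A": 1.0, "B": 2.0, "C": 4.0, "D": 7.0, "E": 11.0}  # hypothetical
IDEAL = 3.5                                                     # hypothetical


def judge_triad(triad):
    """Simulated judge: most preferred first, least preferred last."""
    return sorted(triad, key=lambda s: abs(STIMULI[s] - IDEAL))


def implied_pairs(ordering):
    """The three transitive paired comparisons implied by one triad."""
    most, middle, least = ordering
    return [(most, middle), (middle, least), (most, least)]


# Tally how often each stimulus was preferred to each other stimulus.
wins = Counter()
for triad in combinations(STIMULI, 3):
    for preferred, other in implied_pairs(judge_triad(triad)):
        wins[(preferred, other)] += 1

# Each pair should have been judged n-2 times in all.
times_judged = {
    (a, b): wins.get((a, b), 0) + wins.get((b, a), 0)
    for a, b in combinations(STIMULI, 2)
}

# A pair is judged consistently when all of its judgments agree.
consistent = all(
    wins.get((a, b), 0) == 0 or wins.get((b, a), 0) == 0
    for a, b in combinations(STIMULI, 2)
)

print(set(times_judged.values()))  # {3}
print(consistent)                  # True
```

With five stimuli there are ten triads and ten pairs, and every pair is judged n-2 = 3 times, once in the context of each remaining stimulus.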

Putting together these various methods of collecting data, we 
have a simple order of methods from "Pick n-1" to the Method of
Triads, listed in order in Table I. At the top of this "power" structure
is the method among those discussed here that contains the
most information about the behavior being studied. The other
methods of collecting data are listed in order of decreasing information.
For convenience of discussion, it is desirable to give a name to
the implied attribute underlying this power structure. The term 
"searchingness” is suggested. If we measure a unidimensional attri- 
bute, the methods listed in Table I, then, go from the most searching, 
at the top, to the least searching, at the bottom. 

There are two other data-collection methods in common use
which should be included here. These are the Method of Equal
Appearing Intervals (17) and the Method of Successive Intervals.
They may both be regarded as adaptations of the Method of Rank
Order in which ties in rank are permitted. In the Method of Equal
Appearing Intervals, the subject is instructed to place the stimuli
in a given number of ranks, equally spaced psychologically; in the
Method of Successive Intervals, he is merely given the number of
ranks (piles) with no constraint on their "spacing." These methods,
although less searching than the Method of Rank Order, are not
directly comparable to any of the other "Order k" methods. This
might be more evident if one recognizes that "Order k" for k < n-1
yields a segment of one end of an I scale. The Method of Equal
Appearing Intervals and the Method of Successive Intervals yield
the entire I scale collapsed into steps which are or are not, respectively,
assumed to be equal psychologically.

A more thorough development of the power structure on meth- 
ods of collecting data, including methods not covered here, reveals 
that they are partially ordered with respect to the amount of 
information they contain about the behavior in question. 

There are some general implications and interpretations which 
follow from the power structure on methods of collecting data con- 
sidered from the general point of view of measurement theory. A 
selection of a method of collecting data and a method of analysis 
in any particular instance is dependent upon a resolution of the 
dilemma that the social scientist faces. Having resolved the dilemma 
to suit his purposes, he can select a method of collecting data and 
a method of analyzing it which will, for example, guarantee that he 
will end up with a unidimensional scale; or, if his objective is dif- 
ferent, that will provide a test of dimensionality. 

One of the basic issues in the interpretation of data is illustrated 
by the example of an individual who says, in three successive 
judgments, that he prefers A to B, B to C, and C to A. One does not 
know whether the individual’s judgments are inconsistent or whether 
they are intransitive. With a technique as searching as the Method 
of Triads, it appears possible to make a distinction between behavior 
which may be classified as error and behavior which requires ex- 
planation. By behavior which may be classified as error is meant 
behavior of a single individual which is random over replications 
of the stimulus situation. If the Method of Triads revealed that the 
individual was consistent and intransitive, it is incumbent upon 
the experimenter to accept this as experimental fact, regardless of 
the behavior of this individual on other stimuli or the behavior 
of other individuals on these stimuli. All conventional methods of 
measurement or scaling which classify intransitive judgments or 
other portions of data as error make one or more of the following 
assumptions: (1) that different individuals are replications of one 
another for the same stimulus situation; (2) that different stimuli 
are replications of one another for a given individual; (3) that a 
theory (level of measurement) is valid in spite of the data. It is now 
evident that these assumptions are neither necessary nor desirable 
unless the experimenter has resolved his dilemma by deciding to 
construct a unidimensional scale in spite of the data. 

When a data-collection method less searching than the Method 
of Triads is employed, a distinction between inconsistency and 
intransitivity in an individual's judgments is no longer possible 
unless one or more of the assumptions above are made. The Method of 
Paired Comparisons imposes consistency (reliability) on the 
judgments of an individual, and the Method of Rank Order further 
imposes that the paired comparisons be transitive. If the 
paired-comparison judgments of an individual are transitive, the data 
may be expressed as a rank order with no loss of information. But if 
the data are collected by the Method of Rank Order, transitivity of 
the paired comparisons has been imposed on the behavior by the method 
of observing it and it is not known whether the behavior would have 
been transitive or not. 
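
The conversion from transitive paired comparisons to a rank order can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `rank_order_if_transitive` and the encoding of judgments as (preferred, not preferred) tuples are hypothetical, and every pair of stimuli is assumed to have been judged exactly once. The check relies on the standard fact that a complete set of paired comparisons is transitive exactly when the numbers of "wins" are n-1, n-2, ..., 0.

```python
def rank_order_if_transitive(prefers):
    """Return the rank order implied by a complete set of paired
    comparisons, or None if the judgments are intransitive.

    prefers: set of (preferred, not_preferred) tuples, one per pair
    of stimuli (hypothetical encoding chosen for this sketch).
    """
    stimuli = sorted({s for pair in prefers for s in pair})
    # Count how many times each stimulus was preferred.
    wins = {s: sum(1 for winner, _ in prefers if winner == s) for s in stimuli}
    n = len(stimuli)
    # Transitive judgments give the win counts n-1, n-2, ..., 0.
    if sorted(wins.values(), reverse=True) != list(range(n - 1, -1, -1)):
        return None
    return sorted(stimuli, key=lambda s: -wins[s])


transitive = {("A", "B"), ("B", "C"), ("A", "C")}
cyclic = {("A", "B"), ("B", "C"), ("C", "A")}
print(rank_order_if_transitive(transitive))  # ['A', 'B', 'C']
print(rank_order_if_transitive(cyclic))      # None
```

Going the other way, from a rank order to paired comparisons, always succeeds, which is the sense in which the Method of Rank Order imposes transitivity rather than testing it.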

As we move down the searchingness scale, there is a series of 
successively decreasing numbers of elements in the rank order. Where 
the number of stimuli is n, the individual is asked to give his rank 
order of preference only for the first k ranks where k < n. Those 
methods impose all the properties that the Method of Rank Order 
does but, because the rank order is incomplete, the missing segment 
is in effect determined for each individual by the judgments of the 
other individuals. In other words, since we have no information 
on the end segment of an individual's rank order, it immediately 
is compatible or in agreement with any obtained data. Thus, in all 
methods of collecting and analyzing data, that information in data 
not collected is always regarded as compatible with the information 
that was obtained. 

If a simply ordered scale of stimuli and judges is desired, data 
with too much information in it may contain information 
incompatible with the desire. By the use of a method of collecting 
data which will provide less information, such as a "Pick k" method 
instead of an "Order k," a simple order of all the stimuli may be 
constructed which is inferred to hold for all. This illustrates a 
general interpretation that may be given the searchingness structure 
on methods of collecting data. In one sense the searchingness 
structure may be regarded as a set of criteria for unidimensionality 
of behavior, the less searching being the weakest criteria and the 
most searching the most rigorous. Behavior which under a given 
criterion appears unidimensional will so appear for all weaker 
methods but may or may not satisfy unidimensionality under the 
stronger criteria. 

There is another implication of this power structure on methods 
of collecting data: the fundamental principle that social-science 
data are worth no more than the "effort" expended by the judges in 
making their evaluations. This is illustrated throughout the power 
structure. It is easier for a judge, for example, to "Pick 2" than to 
"Order 2." The latter contains the "Pick 2" information and more. 
The principle is again illustrated by the relationship of the Method 
of Triads to the Method of Paired Comparisons. 

The Method of Triads for relative behavior, task A, is at the top 
of the power structure for all the methods considered here and is 
minimal in the properties it imposes on the data. In fact, this 
method of collecting data permits “almost anything” to happen, and 
the inherent variability and other characteristics of behavior are 
permitted to reveal themselves. The Method of Triads requires so 
much effort on the part of the judges, however, that it is 
impractical for a large number of stimuli, and this is probably the 
primary reason for its being used so little. For the intensive 
investigation of Quadrant I behavior, over a moderate number of 
stimuli, it is the best of all methods presented here. 

Because of the large amount of information in data collected 
by the Method of Triads, a judiciously selected portion of triads 
can be substituted for the Method of Paired Comparison when the 
latter method appears too formidable for the judges. This results 
from the fact that each triad may be converted into three paired 
comparisons with two degrees of freedom. One of the objections to 
the Method of Paired Comparisons is that it is tedious for an 
individual over a large number of stimuli. For n = 20, the number 
of paired comparisons required of an individual is 190. The full 
Method of Triads for 20 stimuli would require the individual to 
make judgments in 1140 triads, and each triad is the equivalent of 
three paired comparisons with two degrees of freedom. It is possible, 
however, to select 63 of these 1140 triads which, with one additional 
paired comparison, would be equivalent to the 190 paired compari- 
sons, but the number of degrees of freedom would be 127 instead of 
190. If the 190 degrees of freedom were required, it would take 95 
triads which would be decomposable into 285 paired comparisons, so 
that some of the 190 different paired comparisons would be repeated. 
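
The counts quoted above follow from simple combinatorics and can be checked with a short Python sketch. The function name `triad_budget` and its dictionary keys are hypothetical, introduced only for this illustration; the rule that each triad yields three paired comparisons with two degrees of freedom is taken from the text.

```python
from math import comb


def triad_budget(n, n_triads, extra_pairs=0):
    """Tally judgments for n stimuli when n_triads triads (plus any
    extra single paired comparisons) are presented.

    Each triad decomposes into 3 paired comparisons but carries only
    2 degrees of freedom; each extra paired comparison carries 1.
    """
    return {
        "paired_comparisons": comb(n, 2),   # all distinct pairs
        "all_triads": comb(n, 3),           # the full Method of Triads
        "judgments_as_pairs": 3 * n_triads + extra_pairs,
        "degrees_of_freedom": 2 * n_triads + extra_pairs,
    }


print(triad_budget(20, 63, extra_pairs=1))
# 190 pairs, 1140 triads, 190 pair judgments, 127 degrees of freedom
print(triad_budget(20, 95))
# 285 pair judgments, 190 degrees of freedom
```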

This presentation of methods of collecting data has by no means 
exhausted the variety. Enough has been presented to permit the 
construction of a simple power structure; one of the directions for 
further generalization has been pointed out; some of the implica- 
tions of the power structure have been indicated. 

Quadrant IV 

The presentation up to this point has been entirely in the con- 
text of Quadrant I, in which the individuals indicate comparative 
preferences between stimuli. These methods are, potentially at least, 
available for use in Quadrant IV also, with an appropriate change 
in instructions. In this quadrant the behavior of individuals consists 
of the comparative evaluation of stimuli with respect to an attribute. 
Thus, for “Pick k," an individual would be asked to “Pick the k 
most aggressive children in this group.” 

The information in such judgments is not which of the stimuli 
is nearer an ideal of the judge on some underlying attribute, as in 
Quadrant I, but which of the stimuli has more of some attribute. 
There is here, in principle, no assumed relationship between any 
ideal which the judge may or may not have and his task B judgment. 
Let this be made very clear. One individual may prefer candidate 
A to B and another one B to A, because each of these two judges 
has different ideals; for one judge, A is nearer his ideal; for the 
other, B is nearer. But if these two judges were asked whether A 
or B was more conservative, a better administrator or speaker, or 
more pro-union, etc., it is not assumed that their ideals or the 
hypothetical preferences of the two judges on any of these 
attributes have any relationship to their judgments. 

In concrete terms, it is assumed that two individuals, one in 
favor of and the other against universal military training, will not, 
for that reason, disagree as to which of the following statements is 
more favorable to universal military training: 

(1) All men at the age of 18 should take one year of military 
training. 

(2) All men at the age of 18 should be urged to take one year 
of military training. 

The immediate objective of collecting data by the methods 
of Quadrant IV is to study the stimuli. Usually this means that the 
data are analyzed to obtain scale values for the stimuli on the 
attributes, commonly in the form of an interval scale. With such 
an objective most of the methods in the power structure would not 
be useful because the stimuli usually have sufficient spread on the 
relevant attribute that under "Pick k" the choices of most judges 
would be the same, and the stimuli not chosen could not be scaled. 
Hence the methods generally used are the more powerful methods 
which require each individual to make comparative judgments in 
many regions of the scale. These methods include the methods of 
Equal Appearing Intervals, Successive Intervals, Rank Order, 
Paired Comparisons (33), and Triads. 

Data collected by these methods under task B cannot be analyzed 
by the Unfolding Technique, because they do not involve evaluation 
of stimuli with respect to an ideal. The method of analysis used on 
such data is usually the Law of Comparative Judgment (33) or a 
modification of it. One can, of course, use this latter method of 
analysis on paired-comparison judgments collected under task A, and 
the results of the analysis will bear a distinct algebraic 
relationship to those obtained from applying the Unfolding Technique 
to the same data. The Unfolding Technique yields a genotypic analysis 
of preferences; the Law of Comparative Judgment, or its 
modifications, yields a 
phenotypic analysis. These two analyses, then, have a predictable 
relation in this general theory. This is an example of a single set of 
data (collected by the Method of Paired Comparison, task A) which, 
depending upon the attitude or objective of the experimenter, may 
be located by his method of analysis in Quadrant I or Quadrant IV. 

To extend the Unfolding Technique to the analysis of Quadrant 
IV data a somewhat different method of collecting data must be 
employed. This method may be called the Method of Similarities. In 
this method the individual is presented with the stimuli three at a 
time as in the Method of Triads. The instructions, however, are to 
judge which pair of the three is most alike and which pair is least 
alike. The instructions may or may not indicate an explicit attribute 
— i.e., the instructions may be to judge which pair of a triad of 
cultures is more alike in their ethics, or the instructions may be 
merely to judge which pair is more alike. In either case the analysis 
of the data follows the same procedure. With the attribute explicit, 
however, the Method of Similarities will yield the latent structure 
underlying the stimuli for this attribute as perceived by an individ- 
ual. With no explicit attribute, the method will yield the latent 
structure underlying the similarities and differences of stimuli as 
perceived by an individual. 

This technique was used by Richardson as a method of collect- 
ing data in his two-dimensional study of color (27). The method of 
analysis, however, was a carry-over of Thurstone’s Law of Com- 
parative Judgment. The method suggested here for analyzing these 
data is quite different— namely, the Unfolding Technique. This 
difference in method of analysis is a direct consequence of the 
definition of the information contained in the data. To apply the 
Unfolding Technique to such data, the information they contain 
is obtained as follows. The responses of the individual to each 
such triad may be converted into three paired comparisons, each an 
element of a different I scale. For example, suppose an individual 
responded to the triad of stimuli B, C, and D with the statements 
that B and C were most alike and B and D least alike. If the 
individual had been asked whether stimulus C or D were more like B, 
the foregoing judgments imply that the individual would have said C. 
In effect, then, the individual was taking stimulus B as his ideal in 
the stimulus space and was saying that “from the point of view of B, 
C is preferred to D.” This yields one of the paired-comparison 
judgments that make up an I scale for the individual standing at the 
position of stimulus B. 

In exactly the same manner his responses to this one triad 
provide one of the judgments for the individual's I scale when 
he is at stimulus C and one of the judgments for his I scale when he 
is at stimulus D. Thus, from the point of view of stimulus C, B is 
preferred to D; and from the point of view of D, C is preferred to B. 
From the individual's responses to the rest of the triads, an I scale, 
not necessarily transitive, may be constructed for the individual 
standing at each stimulus position. 
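
The decomposition of one similarity triad into three paired comparisons, one for each stimulus taken as the standpoint, can be sketched as follows. The function name `triad_to_paired_comparisons` and the tuple encoding are hypothetical, chosen only for illustration. The pair named neither most nor least alike is assumed to be intermediate in similarity.

```python
from itertools import combinations


def triad_to_paired_comparisons(triad, most_alike, least_alike):
    """Map each stimulus in the triad to (preferred, not_preferred),
    where 'preferred' is the other stimulus judged more like it."""
    # Rank the three pairs: 0 = most alike, 1 = intermediate, 2 = least alike.
    rank = {}
    for pair in combinations(triad, 2):
        key = frozenset(pair)
        if key == frozenset(most_alike):
            rank[key] = 0
        elif key == frozenset(least_alike):
            rank[key] = 2
        else:
            rank[key] = 1
    judgments = {}
    for standpoint in triad:
        x, y = [s for s in triad if s != standpoint]
        x_closer = rank[frozenset((standpoint, x))] < rank[frozenset((standpoint, y))]
        judgments[standpoint] = (x, y) if x_closer else (y, x)
    return judgments


result = triad_to_paired_comparisons(("B", "C", "D"), ("B", "C"), ("B", "D"))
print(result)
# {'B': ('C', 'D'), 'C': ('B', 'D'), 'D': ('C', 'B')}
```

The output reproduces the example in the text: from B, C is preferred to D; from C, B is preferred to D; from D, C is preferred to B.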

Klingberg (19), in his study of the hostility relations among 
sovereign states, had each subject rank order (n-1) states in order 
of decreasing friendliness from the remaining state. Each state in 
turn was used as the standard. This technique necessarily yields 
transitive I scales, whereas the Method of Similarities may yield 
intransitive ones. 

Data from either of the above methods may be analyzed by the 
Unfolding Technique to study the latent structure underlying the 
stimulus domain for a single individual. It is apparent, then, that 
such a method of collecting data in Quadrant IV contains information 
which permits it to be mapped into Quadrant I, and the 
methods of analysis appropriate to Quadrant I may be applied to 
these data. In the power structure of searchingness, the Method of 
Similarities corresponds to the Method of Paired Comparisons and 
Klingberg’s technique to the Method of Rank Order. 


Quadrants II and III 

In the power structure of methods of collecting data by the Method 
of Choice, a method was mentioned as a special case, corresponding 
to the Method of Single Stimuli. This will now be discussed in more 
detail. 

Consider an example of Quadrant III behavior. 
thereby implied. The methods of analysis designed for such data 
include the systems of Guttman (16), Lazarsfeld (20), and Test 
Theory (15). Their relations and distinctions will be brought out in 
the next section. 


METHODS OF ANALYZING DATA 

Paralleling the organization of the preceding section, we shall 
consider methods of analyzing data for the various methods of 
collecting data on the basis of their definition of the information 
contained in the data. 

The methods of analyzing data appropriate to Quadrant I 
behavior, which include the Parallelogram Technique and the 
Unfolding Technique, will be discussed first; these will be followed 
by the methods appropriate to Quadrant IV behavior, which include 
the Law of Comparative Judgment and its derivatives and the 
extension of the Unfolding Technique to the Method of Similar- 
ities; finally the discussion will consider the methods appropriate 
to Quadrants II and III, which in the most general case is 
Lazarsfeld's system of theories, within which Guttman’s theory and 
Test Theory are special cases. 


Analysis of Quadrant I Data 

THE PARALLELOGRAM TECHNIQUE. Parallelogram Analysis is 
specifically designed for analysis of data collected by one of the 
"Pick k" methods. A matrix is constructed with items as 
columns and individuals as rows. If an individual endorses an item, 
an entry, X, is made in the cell at the intersection of the respective 
arrays. The rows and columns of this matrix are permuted in an 
attempt to collect the cells with entries in them in a diagonal band 
from top left to bottom right such that the entries in every row 
and in every column are adjacent. If this can be accomplished with 
an indecomposable matrix,¹⁰ the behavior of the individuals can 
be described by a simple order in which the order of the rows and 

10 An indecomposable matrix has the property that there exists an 
arrangement of rows and columns such that every pair of successive 
rows has entries in at least one common column. 
item; a judge evaluating a subject by means of a rating scale. 
From the point of view of the information in the response, he may 
be said to be taking as his ideal his notion of the subject’s ideal, 
and from that position he is picking one of the alternatives on the 
rating scale as being the stimulus nearest this ideal. When the ideal 
which the judge takes is actually his own ideal, task A explicitly, 
the behavior is classifiable as Quadrant II. But in terms of the 
information in each response, there is no difference between 
Quadrant II and Quadrant III; they are both “Pick 1” among a set of 
alternatives being evaluated with respect to an ideal. 

Thus, monotone questionnaire or attitude items, typically of 
the Likert type (21) with alternatives running from "strongly agree" 
to "strongly disagree," are simply rating scales where the data are 
collected by "Pick 1." This type of data is tolerated even though 
it contains no information on the order of the alternatives because 
the experimenter has an a priori order which he regards as 
universally acceptable. This constitutes a generalization of the 
concept of "right answer." As far as the data are concerned there are 
nPn = n! possible simple orders, as will be seen in the next section, 
but this does not matter to the experimenter because he knows which 
one is "right." If the data were collected by any of the other methods in 
the searchingness structure, tests for simple order of the alternatives 
and for common metric relations among the alternatives would be 
possible. 

This way of looking at the Method of Single Stimuli is from 
the point of view of each item separately. Each item is regarded as 
representing an attribute, and its alternatives are the stimuli on 
this attribute among which the judge "picks 1." When there are 
only two alternatives to each item (endorse-not endorse, pass-fail, 
agree-disagree, yes-no, etc.), there is another way of looking at the 
Method of Single Stimuli: from the point of view of the group of 
items taken as a whole. In effect, the experimenter is regarding 
the items themselves, not their alternatives, as the stimuli; then the 
Method of Single Stimuli corresponds to "Pick any," with no 
constraint as in the Method of Choice. 

There is no fundamental distinction between these two ways of 
looking at the Method of Single Stimuli. The fundamental 
distinctions that exist reside in the method of analysis selected by the 
experimenter and the implicit theory about the behavior that is 
Fig. 9. Unfolding analysis of "Order 2" data. 


The permuting of rows and columns is continued until a matrix is 
obtained which meets certain conditions or it is determined that no 
such matrix exists. If the conditions are met, then, as far as the 
power of that method of collecting data goes, the behavior may be 
regarded as a consequence of a single underlying latent attribute on 
which (1) the stimuli have been at least ordered; (2) the judges have 
been placed in equivalence classes corresponding to their equivalent 
phenotypic behavior; and (3) these classes have been ordered. 
Exactly what these conditions are will be made explicit after a 
discussion of the analysis of "Order 3" data. 

The classes of phenotypic behavior under "Order 3" for five stimuli, 
when there is a single common latent attribute underlying the 
phenotypic behavior, are illustrated in Figure 7. For the analysis of 
such data, a matrix is constructed with entries 1, 2, and 3 in each 
row indicating an individual's first, second, and third choice. The 
analysis then consists of permuting rows and columns to form a 
parallelogram, as in Figure 10. 

The characteristics which such a matrix must possess, under the 
condition of a single common latent attribute underlying the 
the order of the columns represent the ordinal positions of indi- 
viduals and items on a latent attribute. If such a pattern is not 
obtained, it may reflect error, multidimensionality, or unrepre- 
sentative sampling. 
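
The adjacency requirement of Parallelogram Analysis can be illustrated with a brute-force Python sketch. It is not the hand procedure described in the text: it simply tries every permutation of rows and columns, which is feasible only for very small matrices. The names `is_parallelogram` and `find_parallelogram` are hypothetical, and the sketch checks only that entries are adjacent in every row and column; it does not test indecomposability or the direction of the diagonal band.

```python
from itertools import permutations


def consecutive(cells):
    """True if the filled cells (truthy values) form one unbroken run."""
    idx = [i for i, c in enumerate(cells) if c]
    return not idx or idx[-1] - idx[0] + 1 == len(idx)


def is_parallelogram(matrix):
    """Entries adjacent in every row and in every column, as arranged."""
    return (all(consecutive(row) for row in matrix)
            and all(consecutive(col) for col in zip(*matrix)))


def find_parallelogram(matrix):
    """Search all row and column orders; return (row_order, col_order)
    for the first arrangement that works, or None if none exists."""
    for rows in permutations(range(len(matrix))):
        for cols in permutations(range(len(matrix[0]))):
            arranged = [[matrix[r][c] for c in cols] for r in rows]
            if is_parallelogram(arranged):
                return rows, cols
    return None


# "Pick 2" data for four items, with rows and columns scrambled.
# Columns are items D, B, A, C; 1 marks an endorsed item.
scrambled = [
    [0, 1, 1, 0],  # picks A and B
    [1, 0, 0, 1],  # picks C and D
    [0, 1, 0, 1],  # picks B and C
]
print(find_parallelogram(scrambled) is not None)  # True

# Three items with every pair picked by someone: no order works.
no_order = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
print(find_parallelogram(no_order))  # None
```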

An illustration of this form of analysis is given in Figure 8 corre- 
sponding to the data collected by "Pick 2” illustrated in Figure 2. 
The rows of Figure 8 each represent one individual from each class 
of phenotypic behavior. 



Fig. 8. Parallelogram analysis of "Pick 2" data. 

A significant aspect of this type of analysis is that it is 
completely subordinate to the data. It does not map the data into a 
unidimensional continuum but, in effect, asks the data whether 
some of the conditions for a unidimensional continuum are satisfied. 
As a consequence, the technique of Parallelogram Analysis does 
not necessarily yield a simple order. It is a weak system of analysis 
in the sense discussed earlier. 

THE UNFOLDING TECHNIQUE. A change in the method of analy- 
sis occurs when we reach the "Order 2” method of collecting data. 
Here the cell entry, previously an X, is either a 1 or a 2, to represent 
an individual’s first or second choice. The analysis then consists of 
permuting rows and columns to form a parallelogram as in the 
analysis of "Pick k" data but in which the entries are digits, as in 
Figure 9. The data analyzed in Figure 9 correspond to those illus- 
trated in Figure 6. 

The results of such an analysis are identical to those of "Pick 2" 
except that with adequate sampling there are twice as many classes 
of individuals. 
choices of the individuals, may now be made explicit for the general 
case of "Order k": 

(1) The entries in each row (the integers from 1 to k) and in each 
column must be adjacent with no blanks. 

(2) The entries in the first row and the first column must 
monotonically increase from left to right and from top to bottom, 
respectively. 

(3) The entries in the last row and the last column must 
monotonically decrease from left to right and from top to bottom, 
respectively. 

(4) The entries in all other columns must monotonically decrease 
and then increase from top to bottom. 
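
The four conditions above can be checked mechanically for a matrix whose rows and columns have already been arranged. The following Python sketch is an illustration only: the function names are hypothetical, blank cells are represented by `None`, and "monotonically" is read in the weak sense (ties between adjacent entries are allowed), which is an assumption made here rather than something the text states.

```python
def values(cells):
    """Non-blank entries of a row or column, in order."""
    return [c for c in cells if c is not None]


def adjacent(cells):
    """Condition (1): filled cells form one run with no blanks inside."""
    idx = [i for i, c in enumerate(cells) if c is not None]
    return not idx or idx[-1] - idx[0] + 1 == len(idx)


def rises(v):
    return all(a <= b for a, b in zip(v, v[1:]))


def falls(v):
    return all(a >= b for a, b in zip(v, v[1:]))


def falls_then_rises(v):
    """Condition (4): weakly decreasing, then weakly increasing."""
    i = 0
    while i + 1 < len(v) and v[i] >= v[i + 1]:
        i += 1
    return rises(v[i:])


def satisfies_order_k_conditions(matrix):
    rows = [list(r) for r in matrix]
    cols = [list(c) for c in zip(*matrix)]
    c1 = all(adjacent(line) for line in rows + cols)
    c2 = rises(values(rows[0])) and rises(values(cols[0]))
    c3 = falls(values(rows[-1])) and falls(values(cols[-1]))
    c4 = all(falls_then_rises(values(c)) for c in cols[1:-1])
    return c1 and c2 and c3 and c4


B = None  # blank cell
# "Order 2" choices over four stimuli (columns A, B, C, D), one row per class.
order2 = [
    [1, 2, B, B],
    [2, 1, B, B],
    [B, 1, 2, B],
    [B, 2, 1, B],
    [B, B, 1, 2],
    [B, B, 2, 1],
]
print(satisfies_order_k_conditions(order2))  # True
# Swapping the first two rows violates condition (2).
print(satisfies_order_k_conditions([order2[1], order2[0]] + order2[2:]))  # False
```

The example matrix is a made-up illustration of the pattern, not a reproduction of Figure 9 or Figure 10.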
Fig. 10. Unfolding analysis of "Order 3" data. 

It is now evident in what sense the Parallelogram Technique 
for the analysis of “Pick k” is a special case of the Unfolding 
Technique for “Order k.” If X's were substituted for the integers in 
Figure 10, the data would correspond to “Pick 3” instead of 
“Order 3,” and discriminations among certain classes of individuals 
would vanish. In this instance, the first four classes would collapse 
into one class, the next two would collapse, and the last four classes 
would collapse, leaving only three classes of individuals, 
corresponding to Figure 4. 
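
The collapse of "Order 3" classes into "Pick 3" classes can be reproduced with a small Python sketch. The stimulus positions below are hypothetical, chosen only so that no two midpoints coincide; they are not the metric behind Figure 10, so the sizes of the collapsed groups need not match the four, two, and four reported in the text. The function name `order_k_classes` is likewise an assumption of this sketch. An ideal point is placed in each interval between successive midpoints, and the k nearest stimuli are recorded in order.

```python
from itertools import combinations


def order_k_classes(positions, k):
    """Distinct first-k preference orders generated by ideal points
    sliding along a one-dimensional scale of stimulus positions."""
    names = sorted(positions, key=positions.get)
    mids = sorted({(positions[a] + positions[b]) / 2
                   for a, b in combinations(names, 2)})
    # One ideal point inside each interval cut out by the midpoints.
    ideals = ([mids[0] - 1]
              + [(lo + hi) / 2 for lo, hi in zip(mids, mids[1:])]
              + [mids[-1] + 1])
    classes = []
    for ideal in ideals:
        order = tuple(sorted(names, key=lambda s: abs(positions[s] - ideal))[:k])
        if order not in classes:
            classes.append(order)
    return classes


# Hypothetical positions of five stimuli on the latent attribute.
positions = {"A": 0, "B": 1, "C": 3, "D": 6, "E": 10}
order3 = order_k_classes(positions, 3)
pick3 = []
for order in order3:
    chosen = frozenset(order)   # dropping the integers leaves only the set
    if chosen not in pick3:
        pick3.append(chosen)

print(len(order3))  # 10
print(len(pick3))   # 3
```

With these assumed positions the ten ordered classes reduce to the three sets {A, B, C}, {B, C, D}, and {C, D, E}, the same kind of loss of discrimination the text describes.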

This comparison reveals the difference in the “searchingness” 
of these methods of collecting data. If an individual under “Order 3” 
had chosen C, A, and B as his first, second, and third choices, the 
Unfolding Technique would reveal him as an “aberrant” individual 
in the sense that the latent attribute underlying his preferences is 
not the same as for the other individuals. Under “Pick 3,” this 
individual would be indistinguishable from the others in the first 
four classes that were collapsed together. The weaker condition 
for unidimensionality would be satisfied under “Pick 3,” but the 
stronger condition under “Order 3” would not be satisfied. If such 
behavior were obtained, the social scientist would then be faced 
with the dilemma of whether to regard this individual as “in error” 
in his judgment or not. Data collected at the power level of 
“Order 3” do not contain such information. 

The analysis presented in Figure 10 has some further 
implications with respect to the genotypic structure underlying the 
preferences of individuals. These are in regard to the metric 
relations of the distances between stimuli on the J scale, as was 
pointed out in the preceding section. 

This form of analysis, the Unfolding Technique, is applicable 
for all methods of collecting data in Quadrant I of the form 
“Order k,” 2 ≤ k ≤ n-1. Ultimately for the Method of Rank Order, 
corresponding to “Order n-1,” there is an integer in every cell of 
the matrix. 

Throughout all of these methods the integrity of the data is 
maintained. No ordered metric or even simple order is necessarily 
obtained unless the data satisfy the required conditions. In the 
domain of social-psychological variables, the data will usually not 
satisfy these conditions and again the dilemma of the social scientist 
arises. Either he has to choose between his theory and his data if 
he believes or insists that an interval scale or a simple order holds 
or he has to use a method of collecting data which permits a 
distinction to be made between inconsistency and intransitivity in the 
judgments of each individual. 

The Unfolding Technique is a method of discovering and 
isolating a latent attribute underlying the preferences of a group 
of individuals or, in different terms, is a method for discovering 
a genotypic scale. The latent attribute is an ordered metric with 
both stimuli and individuals on the joint scale. This is the only 
method available at present for analyzing choice behavior for latent 
attributes. The technique is being extended to unfolding in 
multidimensional space (1). When this is done, it will permit the 
analysis of preferences into two or more latent attributes, an 
analogue of multiple factor analysis. 

The handling of data collected by the Method of Paired 
Comparisons is more involved unless the paired comparisons are 
transitive for each individual. In the latter case, the data may, of 
course, be converted to rank orders and analyzed as above. The theory 
for the analysis of intransitive paired comparisons has not yet been 
completed, and there are several alternative psychological models, 
all of which must be developed. Similarly, the treatment of data 
collected by the Method of Triads, using no stronger psychological 
postulate than that given on page 497, is somewhat more involved 
because at this level of data collection the data contain information 
on random error within individuals. These further developments 
cannot profitably be pursued here.¹¹ 

The Unfolding Analysis of Quadrant I behavior collected by 
the Method of Successive Intervals or the Method of Equal 
Appearing Intervals requires a simple modification in procedure. In 
any row the digit representing the ordinal rank of a pile will be 
associated with as many columns as there were stimuli placed in that 
pile by that individual. The process of analysis is essentially similar 
to that given for the "Order k" methods of collecting data except 
that there is a certain relaxing of the conditions that the final matrix 
must satisfy. This is particularly true of the Method of Successive 
Intervals, in which the piles have no built-in metric relations. 

The discussion of the methods of analyzing data collected by 
the Method of Choice, task A, has all been in terms of a weak 
system of analysis in which the data are regarded as paramount and 
a unidimensional analysis is obtained only if the data satisfy the 
necessary conditions. In general, as one goes down the scale of 
searchingness to methods of collecting data which are less searching, 
assumptions are substituted for data. These assumptions, in a 
general sense, are those necessary to make the data not collected 
compatible with the data collected. 

11 For a more detailed treatment see Coombs (8, Chap. 7). 

It was mentioned earlier that the methods of collecting data 
in Quadrant I constituted criteria for unidimensionality of a latent 
attribute. It may now be seen more clearly in what sense this is so. 
The techniques from “Pick n-1” up to and including “Order 2” are
increasingly stringent criteria for a simple order. From “Order 3”
up to and including “Order n-1,” the Method of Rank Order, these
criteria successively demand not only the same simple order but 
become more and more sensitive to metric relations. Thus, indi- 
viduals may have the same simple order for a set of stimuli but 
different metric relations on them. Such individuals could not be 
placed on a common continuum without violating data. 

The methods of analyzing the data of Quadrant I discussed up 
to this point are methods for discovering latent attributes (called 
genotypic scales) underlying preferences. These same data, however, 
may be used to construct phenotypic scales by any one of several 
systems of analysis developed by Thurstone. A phenotypic scale is 
a scale which best “represents” the data but does not “derive” them;
that is, it does not in some sense go “behind” the data and draw
inferences of a genotypic or explanatory nature but rather attempts
to provide a simple description of all the data.

Methods of analysis for arriving at phenotypic scales for Quad- 
rant I data include the systems of analysis specifically designed for 
each of the following methods of collecting data: the Method of
Equal Appearing Intervals, the Method of Successive Intervals, the 
Method of Rank Order (34), and the Method of Paired Compari- 
sons. The Method of Paired Comparisons is the most general of 
these methods, and Thurstone has developed the Law of Compara- 
tive Judgment for the construction of scales from such data. 

These methods of analysis were designed by Thurstone for the
scaling of stimuli with respect to an attribute, Quadrant IV
behavior, and will be discussed below. They may, however, be applied
to analyze Quadrant I behavior, appropriately collected. The
relationship of such an analysis to an analysis of the same data by the
Unfolding Technique will be pointed out below under the discussion
of Group Scales.
Analysis of Quadrant IV Data

THURSTONE'S SCALING METHODS. As previously indicated, relative
behavior involves a choice between stimuli. In the preceding
subsection the choice involved reference to an ideal, and
hence the behavior reflects the relative preferences of an individual.
In this section the behavior observed will be choice behavior but
with reference to an attribute (Quadrant IV of Fig. 1). Hence the
behavior reflects the individual's judgments on the relative
magnitudes of two or more stimuli in some respect.

The immediate objective of collecting such data is to study
stimuli. Usually this involves using the data to determine the
relative scale positions of the stimuli with respect to some attribute.
The most usual methods of collecting data to achieve this purpose
are those of Thurstone: the Method of Paired Comparisons, the
Method of Rank Order, the Method of Successive Intervals, and
the Method of Equal Appearing Intervals.

The procedures followed to construct a scale from the data
collected by any of these methods are well known and available in
the literature (14) and will not be repeated here. These are all
strong systems which assume certain properties to hold for the
information in the data, and the analysis yields an interval scale
with the stimuli positioned on it. The usual application of interest
to the social scientist, Case V of the Law of Comparative Judgment
applied to data collected by the Method of Paired Comparisons,
assumes that different individuals are replications of one another
for the same stimulus situation. Stimuli must be used which are
relatively indiscriminable so that there is some disagreement
between judges. It is assumed that this disagreement is due to a given
stimulus giving rise to a distribution of sensation magnitudes.
Assuming that this distribution is normal and that the variability of
the distribution of differences between pairs of such distributions
is constant for all pairs (Case V), this latter variability is then used
as the source for a unit of measurement and an interval scale may
be built.
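
A minimal sketch of a Case V solution may make the role of the unit of
measurement concrete. It follows the usual textbook computation (column
means of normal deviates) rather than any procedure spelled out here;
the function name and the layout of the proportion matrix are
assumptions made for the illustration.

```python
from statistics import NormalDist


def case_v_scale(p):
    """Scale values from a matrix of proportions (Case V sketch).

    p[i][j] is the proportion of judges who placed stimulus j above
    stimulus i; the diagonal is ignored. Proportions of exactly 0 or 1
    have no finite normal deviate, so inv_cdf raises an error for them.
    """
    n = len(p)
    std_normal = NormalDist()
    scale = []
    for j in range(n):
        deviates = [std_normal.inv_cdf(p[i][j]) for i in range(n) if i != j]
        # Column mean of the normal deviates, with the diagonal counted as 0.
        scale.append(sum(deviates) / n)
    lowest = min(scale)
    return [s - lowest for s in scale]  # arbitrary origin at the lowest stimulus
```

The resulting values are expressed in units of the (assumed constant)
standard deviation of the differences. Unanimous judgments give a
proportion of 0 or 1 and no finite scale separation, which is why the
stimuli must be relatively indiscriminable.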

The methods of analysis designed for data collected by each
of the other methods (the Method of Rank Order, the Method of
Successive Intervals, and the Method of Equal Appearing Intervals)
are essentially special cases of the Law of Comparative Judgment
and are listed in the order in which a successively increasing number
of assumptions are made or an increasing number of properties are
imposed on the behavior by the method of observing it. These
methods will always yield an interval scale unless the behavior has
no error (i.e., all individuals agree on every judgment) or consists
entirely of error (i.e., individuals split 50-50 on every judgment).
It is of interest to note in passing that the methods of Thurstone
require relatively indiscriminable stimuli, whereas the Unfolding
Technique is much more suitable for completely discriminable than
for relatively indiscriminable stimuli.

These methods of Thurstone are commonly employed to construct
an attitude scale with statements of opinion ranging from
pro to anti. The scale so constructed has the stimuli located on it
but not the judges. One then assumes that the scale obtained holds
for all the judges or for a different group of individuals, and it is
readministered under a method appropriate to Quadrant I or II
to locate the individuals on the scale. Two experimental operations
for collecting data are required to yield a joint distribution, and
the stimuli must be relatively indiscriminable in order that there
be some error variance to yield a unit of measurement.

No better methods have been devised for constructing an interval
scale for measuring attitudes. The Unfolding Technique applied
to such behavior may yield the joint distribution in one experimental
operation, but it will be at best an ordered metric and not
an interval scale. Furthermore, the present writer's experience has
shown that it is much more likely to imply that no such yardstick
exists. If, for reasons of belief or convenience, one requires that a
social psychological attribute be measured on a straight line by use
of the real numbers, it appears to be necessary to use techniques of
observation and analysis which embody sufficient assumptions and
classify sufficient data as error to ensure such a result.

THE METHOD OF SIMILARITIES. In the preceding section on
methods of collecting data, the Method of Similarities was presented
as a method by which the Unfolding Technique could be extended
to Quadrant IV behavior. On the basis of the information contained
in such data, described in the preceding section, the judgments of
the individual may be converted into paired-comparison judgments
with respect to an ideal, in which the individual is regarded as
taking each stimulus in turn as his ideal. If the paired-comparison
judgments are transitive for each ideal, then a rank order from each
ideal is obtained. Such data may be analyzed as described for Order k
in which k = n-1, each row of the matrix representing the rank
order from a given ideal. The entire matrix then represents the
behavior of a single individual over the entire stimulus space. This
technique permits a relatively rigorous and intensive study of a
single individual.

THE GROUP SCALE. It has been pointed out several times that
data collected by certain methods in Quadrant I (e.g., the Method
of Paired Comparisons) may be analyzed by the Unfolding Technique
or by the Law of Comparative Judgment. The relationship
between these forms of analysis will now be shown.

If the Method of Paired Comparisons is used to collect data in
Quadrant I, the judgments of individuals represent preferences on
each pair of stimuli. If the Law of Comparative Judgment is used to
analyze such paired-comparison data, the result is a scale which in
some statistical sense (25) is most descriptive of the preferences of
the individuals taken as a group. The application of such a technique
involves regarding preference as an attribute of stimuli (e.g.,
preferability or popularity) and, in effect, the experimenter is
regarding such data as evaluation of stimuli with respect to an
attribute and placing it in Quadrant IV instead of Quadrant I.

Application of the Unfolding Technique to the same data may 
yield, if the appropriate conditions are satisfied, a Joint scale on 
which the stimuli and the individual judges are located instead of 
just the stimuli. The unfolded J scale is a genotypic scale repre- 
senting an inferred latent attribute underlying the preferences of 
the individuals. The Law of Comparative Judgment solution is a 
phenotypic scale descriptive of the preferences. 

In the more formal development of this general scaling theory, 
the individual’s I scale, which is observed at best as a rank order, 
may hypothetically be regarded as derived from a ratio scale of 
“preferability” for each individual. Let us define a Group scale as 
a mean of all the I scales. In the special case in which the I scales 
arise from a common Joint scale, a Group scale is a Joint scale 
folded in the middle. It can then be shown that a Law of Compara- 
tive Judgment solution to preference data represents an approxima- 
tion to such a Group scale. These theoretical relations have been 
tested in several experiments and have been borne out. Thus, in 
the case in which a single latent attribute underlies the phenotypic
preferences of individuals, the Unfolding Technique yields a Joint
scale with individuals and stimuli located on it; the Law of
Comparative Judgment solution represents the same scale, folded
approximately in the middle, with only the stimuli remaining on it.
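
The folding relation can be illustrated numerically. The sketch below
is only a toy example under stated assumptions: the stimulus positions
and ideals are invented, distance from the ideal stands in for the
hypothetical ratio scale of preferability, and the ideals are
distributed symmetrically about the midpoint of the scale.

```python
def group_scale(stimuli, ideals):
    """Mean distance of each stimulus from the individuals' ideals.

    `stimuli` maps a label to a position on a hypothetical Joint scale;
    `ideals` lists the individuals' positions on the same scale. Distance
    is used as a stand-in for (lack of) preferability.
    """
    return {
        label: sum(abs(pos - ideal) for ideal in ideals) / len(ideals)
        for label, pos in stimuli.items()
    }


stimuli = {"A": 1.0, "B": 3.0, "C": 4.5, "D": 6.0, "E": 10.0}
ideals = [3.0, 4.0, 5.0, 6.0, 7.0]  # symmetric about the midpoint 5.0

means = group_scale(stimuli, ideals)
# Order of the stimuli on the Group scale (most preferred first).
group_order = sorted(stimuli, key=means.get)
# Order obtained by folding the Joint scale at the midpoint.
folded_order = sorted(stimuli, key=lambda s: abs(stimuli[s] - 5.0))
```

Under these assumptions the two orders coincide (C, D, B, A, E). With
an asymmetric distribution of ideals the fold would fall only
approximately in the middle.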

This discussion of Group scales has had explicit reference to 
preferential judgments collected by the Method of Paired Compari- 
sons and analyzed by the Law of Comparative Judgment as well as 
by the Unfolding Technique. It should be apparent that classifying 
data collected by the Method of Rank Order, the Method of Suc- 
cessive Intervals, and the Method of Equal Appearing Intervals in 
Quadrant IV and analyzing such data by the appropriate methods 
simply represent different approximations to the Group scale. Data 
collected by any of these latter methods may also be analyzed by 
the corresponding case of the Unfolding Technique and a unidi- 
mensional Joint scale obtained under the appropriate conditions. 

The theory and the computational analysis of data collected 
by the Method of Paired Comparisons, the Method of Rank Order, 
the Method of Successive Intervals, or the Method of Equal Appear- 
ing Intervals and analyzed by their appropriate Quadrant IV method 
of analysis make no distinction between task A and task B. For these 
methods, all behavior is task B. Only certain of these data, however, 
may be classified in Quadrant I and unfolded, and that is when 
these methods are used to observe task A behavior, the evaluation
of stimuli with respect to an ideal. 


Analysis of Quadrant II and III Data 

As was pointed out in the preceding section, the data associated 
with these two quadrants are those collected by the Method of 
Single Stimuli. It was further pointed out that such data (the experi- 
mentally independent successive responses to a number of items by 
an individual choosing one alternative as his response to each item) 
constituted a special case, “Pick 1,” of the Method of Choice.

From the point of view of the information contained in such
data, two types of items were distinguished: monotone items,
associated with Quadrants IIa and III, and nonmonotone items,
associated with Quadrant IIb. The information contained in monotone
items, in the particular case of two alternatives, is whether the ideal
of the individual is greater or less than the position of the item on
the genotypic scale.

In the case of the nonmonotone items, the information contained
in the data is which of the alternatives to an item is nearer
the individual's ideal in any direction, rather than on one side of
him, as in the case of monotone items. Consequently, if the
“endorse” alternative is “too far away” in any direction, the phenotypic
behavior is “not endorse,” and hence there may be several distinct
kinds of genotypic individuals responding identically phenotypically.

The kinds of data usually classified in Quadrants IIa or III
include mental test data, attitude or questionnaire items with
Likert-type alternatives, items of a “cumulative” nature, and rating
scale data. The kinds of data usually classified in Quadrant IIb include
statements of opinion ranging from pro to anti, administered by
the Method of Single Stimuli with the alternatives “agree” or “not
agree.”

Lazarsfeld is in the process of constructing two related systems
for the analysis of Method of Single Stimuli data. These systems are
his Latent Distance Model and Latent Structure Analysis.¹³ These
systems are complex and as yet relatively undeveloped, so in many
instances they are not practicable. Although these systems are not
computationally feasible, this constitutes only a transient difficulty.
Lazarsfeld's system is actually a theory of theories, a metatheory,
of methods of analyzing irrelative behavior. Lazarsfeld's system is of
such generality that it provides a theoretical framework within
which specific methods of analyzing data collected by the Method
of Single Stimuli may be understood as special cases. Both Guttman's
scaling theory and mental test theory will be presented in this
context.


Viewing various methods of analysis as special cases of a more
general theory reveals the fact that an experimenter in selecting a
method of analysis is selecting a theory about behavior. The data
are either asked to satisfy or forced to satisfy the theory, depending
on the strength or the postulational basis of the specific theory
underlying the method of analysis selected. The common immediate
objective of these techniques is to convert the information in the
data to positions on an ordinal or interval scale. The techniques
of analysis differ only in the properties or constraints they impose
on the information in the data; the techniques range from very
weak systems, which make the least assumptions and may not even
yield a simply ordered scale, to very strong systems, which make
sufficient assumptions to guarantee an interval scale.

12 See, for example, Stouffer (31, p. 141).

13 This distinction is an arbitrary one based on the use of discontinuous
trace lines in the Latent Distance Model and continuous trace lines in Latent
Structure Analysis. The distinction is convenient here because Guttman's scaling
theory is a special case of the first and mental test theory a special case of the
second.

LAZARSFELD'S LATENT DISTANCE MODEL. This model is specifically
designed for the analysis of Quadrant IIa and Quadrant III
data. In the simplest form of this model, the attribute continuum is
assumed to be dichotomized at some point by an item such that all
individuals on one side of that point have a probability pj of
endorsing or passing the item j and all individuals on the other side
of that point have the probability 1 - pj of endorsing or passing the
item (cf. Fig. 11). This model is readily generalizable to more than
two latent classes for other than dichotomous items.

[Fig. 11. Example of (a) Guttman-type item and (b) Lazarsfeld-type
item. Vertical axis: probability of the response pass, agree, endorse,
yes, etc. Horizontal axis: attribute continuum.]

It is immediately apparent that Guttman's scalogram technique
is a special case of the dichotomous item. The Guttman model
requires that pj be zero or one (cf. Fig. 11).

In different terms, Guttman's system of analysis requires
perfectly reliable items (perfect internal consistency), whereas
Lazarsfeld can handle less than perfectly reliable items. By permitting
error or inconsistency in behavior, Lazarsfeld's system can yield a
solution when Guttman's cannot. Lazarsfeld's solution provides a
set of at least two latent classes on the genotypic level and a
probability that each response pattern is associated with each latent class.
When the data satisfy the conditions for a simply ordered scale,
Lazarsfeld's system reduces to Guttman's in that the probability of
a specific response pattern's being associated with a specific
segment of the hypothetical continuum is zero or one.
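
The relation between the two models can be shown with a small
calculation. The sketch below is illustrative only: the cut points, the
choice of which side of the cut receives pj, and the assumption that
responses to different items are independent for a given individual are
all supplied for the example and are not taken from the text.

```python
from itertools import product


def endorse_prob(x, cut, p):
    """Step trace line: probability p at or above the cut, 1 - p below it."""
    return p if x >= cut else 1.0 - p


def pattern_probs(x, cuts, ps):
    """Probability of each response pattern for an individual at x.

    Assumes responses to different items are independent given x.
    """
    probs = {}
    for pattern in product((1, 0), repeat=len(cuts)):
        prob = 1.0
        for response, cut, p in zip(pattern, cuts, ps):
            q = endorse_prob(x, cut, p)
            prob *= q if response == 1 else 1.0 - q
        probs[pattern] = prob
    return probs


cuts = [1.0, 2.0, 3.0]                                # hypothetical item positions
guttman = pattern_probs(2.5, cuts, [1.0, 1.0, 1.0])   # perfectly reliable items
lazarsfeld = pattern_probs(2.5, cuts, [0.9, 0.9, 0.9])
```

With every pj equal to one, the individual between the second and third
cut points produces the scale pattern (1, 1, 0) with certainty. With pj
below one, that pattern is merely the most probable, and the
nonscale patterns receive some probability as well.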

The Guttman scalogram technique is used to analyze Quadrant 
IIa data to determine whether the conditions for a simply ordered
scale are satisfied. The method is closely related to the Parallelogram 
Technique in that a matrix is constructed in exactly the same man- 
ner and the rows and the columns are permuted. But inasmuch as 
these items are monotone items instead of nonmonotone, the final 
matrix, if ordinality is satisfied, has all the X's on one side of the
diagonal (including the diagonal) and all the blank cells on the 
other side of the diagonal. For this reason this method of analysis 
may be called Triangular Analysis. 
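
The permutation of rows and columns can be sketched as follows. This is
a simplified illustration, not Guttman's own working procedure: it
orders items by how many individuals endorse them and individuals by
how many items they endorse, then checks whether the triangular pattern
results. The function name and return values are assumptions made for
the example.

```python
def triangular_form(responses):
    """Permute rows and columns of a 0/1 matrix toward triangular form.

    Returns (row_order, col_order, is_scale). `responses[i][j]` is 1 when
    individual i endorses (or passes) item j.
    """
    n_rows, n_cols = len(responses), len(responses[0])
    # Items endorsed by the most individuals come first; individuals who
    # endorse the most items come first.
    col_order = sorted(range(n_cols), key=lambda j: -sum(r[j] for r in responses))
    row_order = sorted(range(n_rows), key=lambda i: -sum(responses[i]))
    # In a perfect scale each permuted row is a run of 1s followed by 0s.
    is_scale = True
    for i in row_order:
        row = [responses[i][j] for j in col_order]
        if row != sorted(row, reverse=True):
            is_scale = False
    return row_order, col_order, is_scale
```

When `is_scale` is False, the data contain at least one response
pattern that is not a scale type, and no permutation will produce the
triangle.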

Guttman’s scaling theory constitutes a strict adherence to the 
logical structure of an ordinal scale. If all the data do not satisfy the 
conditions, he calls the result either a quasi-scale or nonscale type. 
The latter is really a partial order. If an ordinal scale is insisted 
upon, part of the data must be rejected— either individuals, stimuli, 
or both. Otherwise the latent distance model, which can classify some 
of the data as error and still yield an ordinal scale, must be used. 

The result of a Guttman analysis on data which satisfy the 
necessary conditions is an ordinal scale with the stimuli and the 
response patterns of the individuals simply ordered. The technique 
as conventionally used is applied to dichotomous items. If the items 
contain more alternatives than two, the surest way to find an ordinal 
scale is to group the alternatives to form that dichotomy which best 
happens to satisfy the conditions for an ordinal scale. This procedure 
takes advantage of every error, random or biased, in the data but has 
commonly been followed because of the generally unsatisfactory 
results of trying to scale all the alternatives. 

It is possible to show that the scaling of the alternatives for
different items by Guttman's technique requires an additional
condition above the ordinal property of the scale. It is not feasible
to develop this here, but the general principle is that it is not
the alternatives which should be scaled but the section line between
two adjacent alternatives of an item. The alternatives may themselves
appear to be scaled only if the alternatives of each item mesh,
like the teeth in gears, with the alternatives of the other items. This
condition can be called “orderly interlocking.” If this condition
does not hold, the conditions for an ordinal scale may exist but no
ordinal scale will be found in the data unless the boundaries of the
alternatives are scaled instead of the alternatives themselves. Thus,
an item with five alternatives has four scale positions, one between
alternatives a and b, one between b and c, c and d, and d and e. In
this manner it is also possible to handle, in one scale analysis,
questionnaire items which have varying numbers of alternatives.
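
One way to picture the scaling of boundaries rather than alternatives
is to expand each response into one dichotomy per section line. The
sketch below is an assumed encoding for illustration only; the function
names and the 0-based indexing of alternatives are not from the text.

```python
def boundary_items(response, n_alternatives):
    """Expand one multi-alternative response into its section lines.

    `response` is the 0-based index of the chosen alternative under the
    a priori order; the result has one 0/1 entry per boundary between
    adjacent alternatives (1 = the response lies above that boundary).
    """
    return [1 if response > b else 0 for b in range(n_alternatives - 1)]


def expand(responses, n_alternatives):
    """Expand an individual's responses to items with varying numbers of
    alternatives into one row of dichotomous boundary 'items'."""
    row = []
    for response, k in zip(responses, n_alternatives):
        row.extend(boundary_items(response, k))
    return row
```

An item with five alternatives contributes four columns and a
two-alternative item contributes one, so items of different lengths can
enter the same matrix.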

It has previously been pointed out that every item in the Method
of Single Stimuli, task A, monotone stimuli, is really a disguised
rating scale. The experimenter must select an a priori order among
the alternatives to the item. The individual, in responding, simply
informs the experimenter of his location on the rating scale. Methods
of analyzing Quadrant IIa data are consequently simply methods
of analyzing the task A responses of a number of individuals to a
number of rating scales. Hence, from the abstract point of view of
measurement theory, rating scale data can be mapped into Quadrant
IIa, and hence the methods of Guttman and Lazarsfeld are applicable
to the analysis of rating scale data.

There is one characteristic of Guttman's scaling technique which
is either an advantage or a disadvantage, depending on one's point
of view. If the data do not meet the necessary conditions for an
ordinal scale, one will not be obtained. This is a good characteristic
if one is interested in the study of behavior of individuals and the
nature of attributes, for the technique does not force a more powerful
system on the data than the data satisfy. On the other hand, if an
ordinal scale is demanded, this characteristic is a disadvantage. In
the latter case, if an ordinal scale is not found, the alternatives are
to reject individuals or stimuli, or change the responses of some
individuals to some items, i.e., classify them as error.

Guttman, in his general theory, makes much of a theory of
components which are successive sources of the variance of individuals'
behavior. He particularly makes much of the psychological
interpretations which he gives to the first and second components. The first
component is the simply ordered scale of stimuli and individuals.
The second component is a U-shaped function which he interprets
as intensity. He suggests that individuals at the extremes of the
attitude continuum feel more strongly about their attitude than
those in the middle region, in the neighborhood of indifference. This
was an early suggestion of Katz (18) and is a reasonable psychological
hypothesis. But the interpretation of the second component as a
measure of this intensity of feeling must be very carefully validated
experimentally. As indicated by the class of data to which Guttman's
theory applies, mathematical analysis into components is as valid
for mental test behavior as for attitude data collected on monotone
items. These same mathematical components would be found in the
behavior of individuals on an arithmetic test, for example, but the
same psychological interpretation of intensity of feeling about one's
arithmetic ability would not necessarily follow. The interpretation
to be given components may be different for different domains of
behavior.

LAZARSFELD'S LATENT STRUCTURE ANALYSIS. Instead of regarding
the underlying attribute as having discrete steps or classes, in
Lazarsfeld's Latent Structure Analysis (20) the underlying theory is that
of a continuous gradation in the latent attributes. The analysis, as in
the latent distance model, yields discrete latent classes, but the basic
psychological theory requires an underlying continuum. Latent
structure analysis is Lazarsfeld's general system for the analysis of
nonmonotone items, Quadrant IIb data, but in which monotone items
are a special case. It is in this sense that Lazarsfeld's general theory
is a theory of theories for irrelative, task A, behavior. In principle
his theory is applicable not only to monotone and nonmonotone
items but also to combinations of them in the same test or
questionnaire.

Latent Structure Analysis has the basic concept of a trace line 
associated with each item which represents the probability of an 
individual at any point on the latent attitude continuum responding 
favorably to the given item (Fig. 12).

In the early stages of the development of the theory, the
selection of a function (straight lines, parabola, etc.) for the trace line
was an a priori decision, but there is an expectation that this will
become objective and unique (12). This system is applicable to
monotone items by the use of a trace line which is a continuous
monotonic function of a latent attribute continuum, e.g., a straight
line.

Fig. 12. Illustration of monotone (a, b, c) trace lines and
nonmonotone (d) trace lines in Lazarsfeld's Latent Structure Theory.
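
Two of the functional forms mentioned for the trace line, the straight
line and the parabola, can be written down directly. The sketch below
is illustrative: the parameter names are invented, and the clipping to
the interval from 0 to 1 is an assumption added so that each function
returns a probability.

```python
def linear_trace(x, intercept, slope):
    """Straight-line trace, clipped so that it remains a probability."""
    return min(1.0, max(0.0, intercept + slope * x))


def parabolic_trace(x, peak, height, width):
    """Inverted parabola: endorsement is most likely near `peak`.

    `height` (at most 1) is the probability at the peak; the trace falls
    to zero at a distance `width` on either side.
    """
    return max(0.0, height * (1.0 - ((x - peak) / width) ** 2))
```

With a positive slope the straight line is a monotone trace. The
parabola is nonmonotone, since individuals equally far from the peak on
either side have the same probability of endorsing the item.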

This concept of a trace line is applicable to nonmonotone
items by use of polynomial trace functions (13). The trace function
may be assumed to be (a) a polynomial in several variables, (b) a
linear combination of several variables, or (c) a polynomial in a
single variable. Thus an item in which two extreme groups on a
single attitude variable respond alike and the intermediate group
differently would correspond to a parabolic trace line. This
generalization permits any of an infinite variety of trace lines to be assumed.
By virtue of a separate trace associated with each item, it is possible
to apply Latent Structure Analysis, at least in principle, to data
obtained by the Method of Single Stimuli from a mixture of
monotone and nonmonotone items.

In mental testing, the items are dichotomous (pass, fail) and the
patterns of response are mapped into the integers by counting all the
favorable (passing) elements in a response pattern. Thus a potentially
great variety of different response patterns are all mapped on the
same integer. For example, a score of two could be obtained by passing
any two items: PPFF, PFPF, PFFP, FPPF, etc.

If the conditions for a Guttman scale hold, a one-to-one
correspondence exists between a response pattern and an integer
corresponding to an ordinal position on a unidimensional
continuum. To convert such a scale into an interval scale, an assumption
must be made, such as that the increments in ability between the
ordinal positions of adjacent items on the scale are equal.
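
The many-to-one character of the count score, and the one-to-one
correspondence that holds under a Guttman scale, can be verified by
enumeration. The sketch below assumes four dichotomous items written as
strings of P (pass) and F (fail), with items ordered from easiest to
hardest; the function names are hypothetical.

```python
from itertools import product


def score(pattern):
    """Map a response pattern such as 'PPFF' onto the number of passes."""
    return pattern.count("P")


def patterns_by_score(n_items):
    """Group every possible pass/fail pattern by its integer score."""
    groups = {}
    for combo in product("PF", repeat=n_items):
        pattern = "".join(combo)
        groups.setdefault(score(pattern), []).append(pattern)
    return groups


def scale_types(n_items):
    """Patterns allowed by a perfect Guttman scale, items ordered easiest first."""
    return ["P" * k + "F" * (n_items - k) for k in range(n_items + 1)]
```

For four items, six different patterns share the score of two, whereas
the five scale types each carry a different score.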

When conditions for a Guttman scale are not met, there is more
than one response pattern mapped onto the same integer, and the
scale is represented by a partial order, what Guttman calls a
nonscale type. The rationale for mapping such a partial order on an
interval scale is somewhat more involved, since it is based on a
concept of error and random fluctuations.

The concept of error commonly used in test theory corresponds
to the assumption that the trace line of a mental test item is the
integral of a normal curve (cf. trace line c in Figure 12), although
Carroll (5) is currently working on the theory of a linear trace line
for mental tests. If the items were perfectly reliable, the variance
of the normal curve would be zero and the trace line would take
the form of that shown for item a in Figure 11. The rationale
underlying test theory is critically evaluated by Reese (26) and
Thomas (32).
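
The limiting relation between the normal-ogive trace line and the step
function can be checked numerically. This is a minimal sketch: the
parameter names `difficulty` and `sigma` are labels chosen for the
example, and the zero-variance case is handled explicitly because the
cumulative normal is undefined there.

```python
from statistics import NormalDist


def ogive_trace(x, difficulty, sigma):
    """Probability of passing: cumulative normal centred on `difficulty`."""
    if sigma == 0:
        return 1.0 if x >= difficulty else 0.0  # perfectly reliable item
    return NormalDist(mu=difficulty, sigma=sigma).cdf(x)
```

As `sigma` shrinks, the probability of passing approaches 0 just below
the item's position and 1 just above it, which is the Guttman-type
item.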

Likert (21) has suggested a technique for arriving at a score on
an attitude questionnaire which is a simple extension of mental
testing technique. Likert's system is necessarily confined to monotone
items but is an extension of the scoring methods of mental testing
in that degrees of endorsement are obtained in the data. The
technique is to take a statement of opinion sufficiently extreme so
that it cannot act nonmonotonically, that is, so that there are not
likely to be any people so extreme that they would reject the item
for not being sufficiently extreme. To such an item a degree of
endorsement is obtained, typically from strongly agree to strongly
disagree. The simplified scoring method is to map these alternatives
into the integers, e.g., 1 to 5, so that 1 represents the extreme pro
answer and 5 the extreme anti. The individual's score then may
be simply the sum of those integers into which his responses have
been mapped. Obviously this corresponds identically to scoring a
mental test except that in the latter the response categories, fail
and pass, are mapped into 0 and 1, respectively.
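
The simplified scoring can be written as a lookup and a sum. In the
sketch below only the endpoints and the range 1 to 5 come from the
text; the three intermediate labels are assumed, as is the convention
that every item is worded in the pro direction.

```python
# Assumed labels for the five degrees of endorsement; only the two
# extremes and the 1-to-5 range are given in the text.
LIKERT_WEIGHTS = {
    "strongly agree": 1,
    "agree": 2,
    "undecided": 3,
    "disagree": 4,
    "strongly disagree": 5,
}


def likert_score(responses):
    """Sum of the integers into which the responses are mapped."""
    return sum(LIKERT_WEIGHTS[r] for r in responses)


def mental_test_score(responses):
    """The same scoring with the two categories fail = 0 and pass = 1."""
    return sum(1 if r == "pass" else 0 for r in responses)
```

The two functions differ only in the number of response categories and
the integers assigned to them.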

Likert discusses a more “refined” scoring procedure called the
sigma method of scoring. The assumption is that the distribution
on the underlying attribute is normal. The percentage of individuals
in a given category is converted into a sigma value, and that value is
used instead of the integers in the simplified scoring system. Mapping
the response categories into the real numbers like this instead of into
the integers has been tried in mental test techniques also but in this
area has not proved worthwhile, particularly in view of the labor
involved compared with the simple weights.
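
The text does not spell out how a percentage is converted into a sigma
value. One common reading, assumed here, takes the mean standard-normal
deviate of the slice of the curve that each category occupies. The
sketch below implements that reading only; the function name is
hypothetical.

```python
from statistics import NormalDist


def sigma_values(counts):
    """Sigma value for each ordered response category of one item.

    Assumes a standard normal latent distribution and uses the mean
    deviate of the slice of the curve that each category occupies.
    Categories with a zero count are not handled in this sketch.
    """
    nd = NormalDist()
    total = sum(counts)
    values, cum = [], 0
    for count in counts:
        lower, upper = cum / total, (cum + count) / total
        # Ordinates at the slice boundaries; the density is 0 at +/- infinity.
        pdf_lo = nd.pdf(nd.inv_cdf(lower)) if 0.0 < lower < 1.0 else 0.0
        pdf_hi = nd.pdf(nd.inv_cdf(upper)) if 0.0 < upper < 1.0 else 0.0
        values.append((pdf_lo - pdf_hi) / (upper - lower))
        cum += count
    return values
```

Under this reading the values depend on the observed percentages, so
the spacing between categories is no longer forced to be equal as it is
with the integer weights.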

This problem of arriving at scale scores on an interval scale has
conventionally been treated primarily as an empirical problem and
by means of statistical criteria, e.g., reliability, validity, etc. As
indicated earlier, the level of measurement into which the data are cast
is an intrinsic part of any theory about the data. If a given set of
data should actually satisfy only a partial order, it can always be
mapped into an interval scale by defining a constant and common
unit of measurement. Such a scale will usually exhibit a significant
degree of reliability in the sense of a sufficiently high ratio of the
variability between individuals' scale positions to variability within.
In such a circumstance, statistically significant relations can be
obtained with other similarly defined variables, and hence these
procedures have tremendously practical value for applied problems
and lead to an effective actuarial science.

Lazarsfeld's system is also a generalization of Thurstone's
multiple factor analysis (35) in that it is not confined to unidimensional
analyses and permits other assumptions to be made than that of
linear summation, which underlies factor analysis. Lazarsfeld makes
use of a generalized concept of partial correlation between items,
whereas multiple factor analysis makes use of only the zero-order
correlations. It is this element of Lazarsfeld's system which multiplies
the computational complexity, because the system, in principle, may
make use of partials of high order.

In this discussion of Quadrants II and III, it has been pointed
out that Lazarsfeld's general theory is actually a variety of theories
about irrelative behavior and that specific systems of analysis
designed ad hoc for particular objectives are special cases of this more
general theory. The advantage of an abstract approach to methods
of collecting and methods of analyzing data, from the point of view
of measurement theory, lies in its permitting the comparison of
different techniques and their interrelations; it also permits
generalizing their applicability.

For example, from this abstract point of view, the data from
certain learning experiments can be classified as Quadrant IIa data.
A conditioning experiment in the abstract corresponds to a mental
test given backwards. The conditioned stimulus and the
unconditioned stimulus together may be regarded as the stimulus situation.
If the response occurs after the conditioned stimulus and before
the unconditioned stimulus, the item is “passed”; if not, it is “failed.”
The stimuli are presented in the order of the most difficult item
first and, as conditioning progresses, the items become easier. The
analogy is now clear. If one wants to use the strong system of test
theory, each individual's performance is mapped into the integers
by counting trials to learn. This requires the assumption that the
increment in learning between all pairs of successive trials is
constant within and among individuals.

In conditioning experiments the presentation of stimuli may be
continued until the individual gets a certain number of them right
in succession. In mental testing the corresponding concept would
be to require the individual to get a certain number of items wrong
in succession. In general, any special technique designed for the
analysis of learning data, such as a sequential type of analysis, would
then be applicable in principle to the analysis of questionnaire
and mental test data which also meet the abstract conditions of
Method of Single Stimuli, task A, monotone stimuli. On the other
hand, any technique designed for the analysis of such Quadrant IIa
data is in principle applicable to the analysis of data from such
learning experiments. Hence Lazarsfeld's latent distance model
or latent structure analysis could be used to study learning.

Lazarsfeld's system gains much of its importance from the fact
that it is designed for the analysis of data collected by the Method
of Single Stimuli, and this method is probably more widely used,
particularly in social science, than any other. The method is widely
used in public opinion surveys; certain sociometric data also fall
into this class of behavior. When, for example, the members of a
group are asked to indicate which individuals in the group they
prefer as friends, with no restriction on the number an individual
picks, the behavior may be irrelative, task A, nonmonotone. If a
restriction is placed on the number to be chosen, the behavior
becomes relative (Method of Choice), task A, and the methods of
analysis appropriate to Quadrant I are applicable.


Distribution-free Statistical
Methods and the Concept of
Power Efficiency


Keith Smith 


Social scientists today are more aware perhaps than any other
scientists of the restrictive nature of a priori assumptions concerning
data. But although they sometimes follow tortuous routes to avoid
“common sense” assumptions which go beyond the theoretical
framework in which their studies are embedded, nevertheless, new
measures and measuring devices are constructed and applied to two
groups, and tests of the hypothetical difference between the groups
are used which are based entirely on the assumption that the measured
values are distributed according to the normal curve over both
populations involved. Thus, although they are very sensitive to
assumptions about what might be called their “real world,” social
scientists are prone to be insensitive to assumptions in the statistical
systems in which they embed their data. The measurement or
statistical systems into which data are mapped constitute an integral
part of the theory and assumptions about the “real world.”

For example, the crux of a sophisticated theoretical question about
the “real world” may lie on a base which requires that, had the
instrument been applied to all of some universe, usually hypothetical,
the distribution of measurements would have had a specific functional
form. Ordinarily no evidence is ever gathered which might tend to
confirm or deny this assumption.

This gives rise to two questions: Why is this so? What can be done
about it? The first question is not too difficult to answer, at least in
part. Statistics based on the normal distribution have had tremendous
success in both the physical and the biological sciences. Furthermore,
if the normality assumption is justified, no bothersome decisions need
to be made concerning which test of a specific hypothesis to use. The
statistician has been able to tell us which test is “best” (in a sense to be
discussed later). A third reason, a reason for which the statistician
need not take all the blame, is, according to the statistician Geary (4),
“the beauty of the mathematical theory and the facility of algebraic
manipulation involved.” The social scientist has been all too prone to
seek the approval (and gratitude) of the statistician by allowing the
normality assumption to be made, even when knowledge of the subject
matter involved indicates that the assumption is invalid.

In recent years mathematical statisticians have begun to construct
answers to the second question, “What can be done about it?”
This area of statistics must be one not inextricably bound to normal
theory. One such area is called variously “order statistics,”
“nonparametric statistics,” or “distribution-free” statistics. Even here
not all assumptions concerning the mathematical form of the
distributions under consideration have been dropped. One must still
assume for most methods that the population is continuous. Although
this is still a very strong assumption, its palatability is increased by the
absence of the additional assumptions required by conventional or
parametric statistics.

Statistical procedures, both parametric and nonparametric, fall
into two classes corresponding to two general purposes for which they
may be used: (1) testing whether a population(s) from which one or
more samples were drawn has a certain characteristic (e.g., Is the
population normal? Do the populations have the same means?), and (2)
estimating some number characteristic (parameter) of the population
represented by a sample (e.g., Within what limits does the mean
probably lie?). These are ordinarily called test statistics and estimation
statistics, respectively.

A number of different statistical procedures have been devised to
accomplish each of these purposes. Some of the procedures are
alternatives under a given set of conditions. It becomes desirable, then,
to be able to evaluate or compare procedures in order to be able to
select one that in some sense best satisfies the experimenter’s objectives.

The comparison of test statistics is based on the question “How
often will it give the right answer?” For example, if a difference exists
between the means of two populations, will one test more often yield a
significant difference than another test for random samples of the
same size? The comparison of estimation statistics, succinctly referred
to as estimators, is based on two questions: How often will it give the
right answer, and how much information does it give about the
population parameter?

The “How often” question is answered in the same way for both
classes of statistical procedures by means of what is called the power
curve. This will be discussed in general terms in the second section of the
chapter. In the third section, where various nonparametric tests of
significance are discussed, the concept of the power curve will be used
for the purpose of comparing the nonparametric tests with
conventional parametric tests.

The second question used to evaluate estimators, “How much
information does it give about the population parameter?” pertains to
the size of the confidence interval which brackets the parameter. Other
things being equal, the shorter the confidence interval an estimator
yields on the average, the better the estimator. The fourth section of the
chapter will deal with estimators; unfortunately, however, very little
has yet been accomplished on the problem of comparing nonparametric
estimators.


CRITERIA FOR THE COMPARISON
OF STATISTICAL TESTS

As a consequence of the fact that alternative statistical procedures
have been developed which can be applied to the same data, it becomes
both necessary and desirable to provide a rational basis for choosing
one procedure instead of another. In this section, a number of criteria
for making this decision will be presented, with particular emphasis
on the concepts of the power and the power efficiency of a statistical
test. This choice of emphasis is due to the general unfamiliarity of these
concepts to social scientists and to the significant role they should play
in the choice of a statistical test.

Before the criteria are presented and discussed, it will be desirable
to review briefly certain characteristics of statistical tests in general.
In this presentation, a knowledge of elementary sampling theory will
be assumed.


Definition of Statistical Tests

A statistical test is a formal mechanism, based on probability, for
arriving at a decision about the reasonableness of an assertion. The
assertion is called a hypothesis, and any value (usually a number)
obtained from a sample of data is a test statistic. The mechanism makes
use of the one or more obtained values (test statistics) to arrive at a
probability statement about the assertion. But the mechanism almost
always makes use of more than that, and these additional things are
other assertions about the population from which the sample was
drawn (e.g., normally distributed), the manner in which the sample
was drawn (e.g., randomly), etc.

From the point of view of the person making use of the statistical
test, the assertions involved are of two kinds. One is an assertion
directly related to the purpose of the investigation: this is an assertion
which is to be tested and is called a hypothesis. All the other assertions
are those which it is necessary to assume to make a probability
statement. This second set of assertions is called the model. All
probability statements about a hypothesis are preceded, implicitly or
explicitly, by the qualifier “If the model used was correct, then . . .”

It should be clear that the weaker the assertions that define the
model, either by virtue of fewer assumptions or less restrictive
assumptions, the more general the conclusions. On the other hand, the
stronger the model, i.e., the more assumptions built into it, the more
powerful will be the test of the hypothesis.

As an illustration, a statistical hypothesis might be “Population A
has a mean equal to the mean of population B.” Such a hypothesis
contains no normality statements and, in fact, no statements about any
characteristics of the populations other than the means. If a test of this
hypothesis is used which contains an assumption of normality in the
parent populations, normality becomes a part of the model and
normality is not tested.

Two characteristics of a statistical test have been discussed: (1) the
model, the assertions which are assumed to be right, and (2) the
statistical hypothesis, the assertion which is to be tested, called a null
hypothesis. A third characteristic always required in a statistical test is
the hypothesis or class of hypotheses which are alternative to the null
hypothesis. These alternative hypotheses are always part of the test,
whether explicit or implicit. These constitute the assertion that is
accepted if the null hypothesis is rejected.

From this point of view, it is evident that there are two types of
error which are possible in arriving at a decision about the null
hypothesis. Type I: accepting the alternative hypothesis when the null
hypothesis is true; Type II: accepting the null hypothesis when it is
false.

There is an inverse relationship between the likelihoods of making
these two types of error. If the null hypothesis were always accepted
regardless of the sample test statistic, the probability of a Type I error
would be zero, but there would be a maximum likelihood of a Type II
error. Correspondingly, if the null hypothesis were always rejected,
the danger of a Type II error would vanish, but there would be a
maximum likelihood of a Type I error. A compromise must be reached
between these two dangers, and various statistical tests offer the
possibility of different compromises. This is the problem for which the
concepts of the power of a statistical test and its power efficiency are
relevant.

Power and Power Efficiency

The concepts of power and power efficiency will be introduced by
way of an illustration.

A COMPARISON OF TWO STATISTICAL TESTS. A common problem
is a test of a hypothesis about the value of the mean of a population.
The null hypothesis might be that the mean of the population from
which the sample was drawn is 4. An alternative hypothesis might be
that it is 5, and an infinite class of alternative hypotheses all used at
once might be that it is greater than 4.

The simplest possible case of a statistical test is whether a
population has one or the other of two specific values, all the other
characteristics of the population having been specified in the model.
Let the assertions that the population is normal with variance equal to
one constitute the model. Let the null hypothesis be

    $H_0$: the mean is 1.

Let the alternative hypothesis be

    $H_1$: the mean is 2.

A sample of size 1 is drawn and, on the basis of the model and the
test statistic (the sample value), a choice is to be made between the
null hypothesis and its alternative. The situation specified by the
model is illustrated in Figure 1.

Fig. 1. The distributions specified by the alternatives $H_0$ and $H_1$.

Let us hold constant the probability of a Type I error at 5 percent.
Two statistical tests that may be used are the one-tailed test and the
two-tailed test. The probability of a Type I error is called the
significance level of the test.

To use the one-tailed test, we shall choose a number A larger
than 1 (the mean under the null hypothesis). If our sample value is
larger than A, we shall agree to reject the null hypothesis and, of
course, to accept the only alternative hypothesis, i.e., that the mean
is 2. If the sample value is less than A, we shall agree to accept the
null hypothesis. A must be chosen in such a way as to make the
probability of a Type I error 0.05. This will be the case if the area
under the curve $H_0$ to the right of A is 0.05, and from a table of normal
curve areas we see that A must be 2.645. The probability of making a
Type II error is the area under the curve $H_1$ and to the left of A,
i.e., the proportion of times we will accept $H_0$ when $H_1$ is really true.
Again, from a table of normal areas, we see that this probability is 0.74.

To use the two-tailed test, we choose two numbers, B and C,
equally distant from the mean under the null hypothesis. If our sample
value is farther from 1 than these numbers, we agree to reject the null
hypothesis and accept the only alternative. If the sample value is
nearer 1 than B and C are, we accept the null hypothesis. The area
under $H_0$ to the right of B must be 0.025, and the area to the left of C
must also be 0.025 to make the total 0.05. And from our table
B = 1 + 1.96 = 2.96 and C = 1 - 1.96 = -0.96. The probability of a
Type II error is,
of course, the area between -0.96 and 2.96 under the curve $H_1$. This
area is 0.83.

Thus, although the two tests have the same amount of Type I
error, the two-tailed test of this hypothesis has more probability of a
Type II error, and in this sense is the poorer of the two tests against this
alternative. In this simple example, it should be apparent that values
to the left of C are poor evidence for $H_1$. The two-tailed test is used
here only to illustrate the fact that the rejection level of a test is not a
sufficient basis for its evaluation.

POWER. We are now ready to introduce the concept of power.
The power of a statistical test against a specific set of alternative
hypotheses at a specific significance level is given by the equation

    Power = 1 - (Probability of a Type II error).

Power might alternately (but equivalently) be defined as the
probability of rejecting the null hypothesis when the alternative
hypothesis is true. In Figure 1, the power of the one-tailed test is the
area to the right of A under the curve $H_1$. In tests of means from normal
populations, the concept of power is synonymous with the Fisher
concept of “amount of information” (3).

Research workers sometimes use certain tests because the tests
are “conservative,” meaning that they are less powerful than others
that might be used. The discussion above exposes this as a rather
peculiar sort of conservatism. Testable theoretical deductions are rare
enough without loading the dice against them, but this is what the
research worker does if he wishes to reject the null hypothesis.
Furthermore, the “conservative” test is in no sense conservative if the
research worker wishes not to reject the null hypothesis (as in
homogeneity of variance tests in analysis of variance).

In our example above, the power of the one-tailed test is 0.26,
whereas the power of the two-tailed test is 0.17. Thus, the one-tailed
test is “more powerful” than the two-tailed test at the 5-percent level
against the hypothesis that the mean of a normal distribution with
variance 1 is 2, when the mean under the null hypothesis is 1. So far,
all these qualifying phrases are necessary and illustrate the caution
essential when statistical tests are compared. It can be shown, in this
simplest of cases, that, at any significance level, whenever all alternative
hypotheses state means on one side of the null hypothesis mean, the
one-tailed test will be more powerful than the two-tailed test.

To show that the power of a test is determined by the alternative
hypotheses, let us take as another example the same situation except
that here the alternative hypothesis is that the mean is 3 (Fig. 2). In
this case the values of A, B, and C remain the same, being determined
by the null hypothesis. The area to the left of A under $H_1'$ here is 0.36,
whereas the area under $H_1'$ between B and C is 0.48. Consequently, the
power of the one-tailed test is 1 - 0.36 = 0.64, whereas the power of
the two-tailed test is 1 - 0.48 = 0.52.

One can plot, for this model and a 5-percent significance level, the
power of each test against a hypothesis versus the value of the mean
specified by that hypothesis. The curves obtained will be the “power
curves” for the one-tailed and two-tailed tests (Fig. 3).
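The areas quoted in this illustration can be checked numerically. The following is a minimal sketch in Python, assuming the model stated above (a normal population with unit variance, a sample of size 1, null mean 1, 5-percent level). The function names are illustrative and not part of the original text; the normal-curve table is replaced by the standard library's `statistics.NormalDist`.

```python
from statistics import NormalDist

ALPHA = 0.05       # significance level (probability of a Type I error)
NULL_MEAN = 1.0    # mean under the null hypothesis
std_normal = NormalDist()  # unit variance, as the model specifies

# The cutoffs depend only on the null hypothesis and the significance level.
A = NULL_MEAN + std_normal.inv_cdf(1 - ALPHA)      # one-tailed cutoff
B = NULL_MEAN + std_normal.inv_cdf(1 - ALPHA / 2)  # upper two-tailed cutoff
C = NULL_MEAN - std_normal.inv_cdf(1 - ALPHA / 2)  # lower two-tailed cutoff


def power_one_tailed(alt_mean):
    """Probability that the sample value exceeds A when the mean is alt_mean."""
    return 1 - NormalDist(alt_mean, 1).cdf(A)


def power_two_tailed(alt_mean):
    """Probability that the sample value falls above B or below C."""
    alt = NormalDist(alt_mean, 1)
    return (1 - alt.cdf(B)) + alt.cdf(C)


# Tabulating power over a grid of alternative means traces the power curves.
for mean in (0, 1, 2, 3):
    print(mean, round(power_one_tailed(mean), 3), round(power_two_tailed(mean), 3))
```

Against the alternative that the mean is 2, this sketch gives powers of about 0.26 (one-tailed) and 0.17 (two-tailed); at the null mean itself both tests reject with probability 0.05, which is the significance level.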
$H_1$: mean is 2; $H_2$: mean is 3. Here, also, the one-tailed test is more
powerful against both alternatives, since it is more powerful against
each singly. The difference in tests would only be that rejecting $H_0$
would not imply accepting a single alternative but would imply only
accepting the pair of alternatives.

However, if the alternatives had been $H_1$: mean is 0; and $H_2$: mean
is 2, a considerable difference obtains. As Figure 3 indicates, the
one-tailed test has almost no power (0.004) against the hypothesis that
the mean is zero, whereas the two-tailed test has a power of 0.17, at
least, against both alternatives.

A still more complex test would be one with the infinite number
of alternatives $H_1$: mean is greater than 1. Again, the two-tailed test is
less powerful than the one-tailed test, since it is less powerful for all
alternatives.

Last, but most often used, is the test with the infinite class of
alternative hypotheses: the mean is not 1. As would be expected from
Figure 3, the two-tailed test would be used, since it is only slightly less
powerful on the right of the null hypothesis, but extremely more
powerful on the left.

POWER AND SAMPLE SIZE. The preceding illustration was entirely
in terms of a sample of size one. The illustration can be generalized by
using, instead of the sample value, the number $(\bar{X}\sqrt{n})/\sigma$,
where $\sigma$ is specified in the model as the standard error of the normal
distribution involved, $n$ is the sample size, and $\bar{X}$ is the sample
mean. As $n$ increases the test statistic $(\bar{X}\sqrt{n})/\sigma$ increases
and, by Figure 3, the power of the test increases also. Figure 4
illustrates this increase in power of the two-tailed test of the mean for
samples of size 1, 4, and 9 from normal populations with unit variance;
$\mu_0$ is the mean under the null hypothesis.

For almost every statistical test in use, including all those to be
discussed in this chapter, increasing sample size increases the power of a
statistical test.

To summarize, (1) the power of a statistical test is the probability
of rejecting the null hypothesis when it is false; (2) power is relative
to the model employed and to the alternative hypotheses (possibilities)
entertained; (3) as a general rule, the power of a statistical test increases
with sample size.
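The growth of power with sample size can be illustrated with a short calculation. This is a sketch only, assuming a two-tailed test of the mean of a normal population with known unit variance; the function name, its default arguments, and the choice of a true mean one unit from the null mean are assumptions made for the example, not values taken from Figure 4.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_power(true_mean, n, null_mean=0.0, sigma=1.0, alpha=0.05):
    """Power of the two-tailed normal test of the mean with n observations."""
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    # Distance between true and null means, in standard errors of the mean.
    shift = (true_mean - null_mean) * sqrt(n) / sigma
    return (1 - z.cdf(crit - shift)) + z.cdf(-crit - shift)


# Same alternative, growing sample: the power rises with n.
for n in (1, 4, 9):
    print(n, round(two_tailed_power(1.0, n), 3))
```

With the true mean one standard deviation from the null mean, the power rises from roughly 0.17 at n = 1 to roughly 0.85 at n = 9, while the probability of rejecting a true null hypothesis stays at 0.05 for every n.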

POWER EFFICIENCY. Unfortunately, there are considerations other
than those of power that must be made in the choice of a statistical test.
Is the test simple computationally? Is the model (set of assumptions)
required for this test “true to life”? Leaving the first question for the
moment, let us consider how the second might be answered.

It was stated earlier that the weaker the assertions constituting
the model, the more general the conclusions but the less powerful the
statistical test. This is very generally true for a given sample size but not
necessarily so if the tests use different-sized samples. Test A may be
better (more powerful) than test B for samples of size twenty, but test
B may very well be more powerful with a sample of size twenty than
test A is with a sample of size ten. In other words, it is necessary to pay
for increased generality of conclusions with a larger sample. Power
efficiency is a measure of how much one has to pay in any specific case.

If test A with a sample size $N_A$ has the same power as test B with
the sample size $N_B$, where test B is the most powerful test (known or
hypothetical) for $N_B$ observations, then test A has power efficiency
$100(N_B/N_A)$ percent. If test A requires a sample of 10 to be as powerful
as test B with a sample of size 6, then test A has power efficiency of 60
percent. The question becomes “Is it worth the extra expense of taking
a sample this much larger to arrive at more general conclusions?”
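The definition amounts to a one-line calculation. The sketch below simply restates it in Python; the function and argument names are illustrative and do not come from the original text.

```python
def power_efficiency(n_best, n_test):
    """Percent power efficiency of a test that needs n_test observations to
    match the power of the most powerful test using n_best observations."""
    if n_best <= 0 or n_test <= 0:
        raise ValueError("sample sizes must be positive")
    return 100.0 * n_best / n_test


# The example from the text: 10 observations needed to match 6.
print(power_efficiency(6, 10))
```

Read the other way round, a power efficiency of 60 percent means that the less restrictive test needs 10/6, or about 1.67, times as many observations to achieve the same power.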

The preceding paragraph is adequate when there is only one
alternative hypothesis. When there are more than one, the phrase “as
powerful as” loses meaning. Test A may be more powerful against
alternative one but less powerful against alternative two. How, in this
instance, can we say one test is more powerful than the other, or as
powerful as the other? In Figure 5, which test is more powerful?

No unique or “best” answer exists. Many rational answers can
be found, two of which can be discussed here. A commonly used
answer is the equal-area criterion. In Figure 5, test A, by this criterion,
would be as powerful as test B if the ruled area were equal to the
cross-hatched area. Another means of equating power curves is to adjust
sample sizes so that the power curves intersect at some central power
level (say 0.50). The two definitions do not yield essentially different
answers ordinarily, and the two tests in Figure 5 are a case in point.



Fig. 5. A portion of the power curves of two hypothetical tests. 


A second and last important complication is that power efficiency 
itself may vary for different-sized samples. Test A may have a power 
efficiency of 0.70 with samples of 5 and a power efficiency of 0.95 with 
samples of 100. A test’s power may increase rapidly with sample size 
while the power of the most powerful test is increasing only slowly. 

To summarize, then, power efficiency is a measure of the power of a
statistical test relative to the most powerful test possible. Although
power efficiency may be defined in several ways, the definitions are
roughly equivalent. Finally, power efficiency is not completely
independent of sample size in all cases.


TESTS OF HYPOTHESES WITH 
ORDER STATISTICS 

The rest of this chapter will be devoted to a discussion of a number
of nonparametric tests and estimation procedures. All available
information concerning the power or power efficiency of each test will
be presented. The tests, in almost all cases, will be compared with
tests based on normal models. Unfortunately, the power and the power
efficiency of a test cannot be determined unless some definite functional
form for the underlying distributions is specified.

For several reasons the tests will be compared entirely on the
basis of underlying normal distribution; that is, the power efficiencies
of nonparametric tests relative to the corresponding normal tests will
be given for those cases where the normal tests are strictly applicable.
We shall be answering the question “How much do we lose by using
these nonparametric statistics when we could use normal statistics?”
“How much could our apprehensions cost us if our fear (of
nonnormality) was unjustified?”

The more pressing question, “How important is it that my results
apply generally rather than to populations distributed normally?”
must, of course, remain unanswered, since it can only be answered
by the research worker each time he considers a test of experimental
data.

The experimental tests presented will fall into two
mutually exclusive groups which for convenience will be labeled tests
of location and tests of relation. These correspond, respectively, to the
ordinary tests of mean difference and tests of correlation, with tests
analogous to the analysis of variance occupying some middle category.


Tests of Location

THE SIGN TEST. This test is useful in cases where a t test of
differences is ordinarily used; that is, where a pairing of observations
is available. A common application is the before-after
experiment, in which measurements are made on each subject, a
treatment or stimulus is applied to each, and the measurements are
repeated. The null hypothesis $H_0$ would be that there is no effect,
whereas the alternatives could be either $H_1$: the effect is positive
(negative), or $H_2$: there is an effect. For the purposes of the sign test,
these are reformulated in this way.

$H_0$: The median of the distribution of differences between before
and after measures is zero, against $H_1$: this median is positive (negative),
or $H_2$: this median is not zero.

The power efficiency (in the remainder of this chapter, all power
efficiencies are quoted for 5-percent level tests) of this test for normal
samples is rather low, ranging from 63.7 percent for larger samples
up to 68 or 70 percent for small samples. The sign test with 18 pairs
of observations is approximately equivalent to a t test with 12 pairs
when the t test is applicable, 40 pairs for the sign test compared with
28 or 29 for the t test.

Let us suppose that a questionnaire about attitudes toward
Negroes has been administered to a group of 22 subjects, after which
the subjects undertook an extensive study of race prejudice. The
following results were obtained on a follow-up questionnaire.


Subject  After  Before  Difference     Subject  After  Before  Difference

    1      35     31       + 4            12      28     27       + 1
    2      36     29       + 7            13      27     26       + 1
    3      29     35       - 6            14      27     25       + 2
    4      34     32       + 2            15      33     32       + 1
    5      33     29       + 4            16      42     40       + 2
    6      28     33       - 5            17      19     18       + 1
    7      33     30       + 3            18      37     36       + 1
    8      28     38       -10            19      40     40         0
    9      28     35       - 7            20      32     31       + 1
   10      25     22       + 3            21      31     27       + 4
   11      33     31       + 2            22      30     29       + 1


If the null hypothesis were true, that is, if these 22 differences
were drawn from populations with a median of zero, we should expect
about half of them to be positive and half negative. We see that, in fact,
17 are positive, 4 negative, and one is zero. We should expect that the
distribution of +’s and -’s would be about the same as that of heads
and tails in tossing 22 unbiased coins. The probability of 17 heads and
4 tails (one coin rolling out of sight) is less than 0.01, so we would reject
$H_0$ at the 1-percent level if our alternative hypotheses are specified by
$H_2$. Had we decided beforehand to use the sign test with the one-tailed
alternatives $H_1$: the effect is positive, we would have rejected $H_0$ at
the -percent level.
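The coin-tossing argument can be carried out directly with binomial probabilities. The following Python sketch uses the 22 pairs of scores listed above; the variable and function names are illustrative, and the exact binomial tail stands in for the published tables.

```python
from math import comb

# After and before scores for the 22 subjects, in subject order.
after = [35, 36, 29, 34, 33, 28, 33, 28, 28, 25, 33,
         28, 27, 27, 33, 42, 19, 37, 40, 32, 31, 30]
before = [31, 29, 35, 32, 29, 33, 30, 38, 35, 22, 31,
          27, 26, 25, 32, 40, 18, 36, 40, 31, 27, 29]

diffs = [a - b for a, b in zip(after, before)]
n_plus = sum(d > 0 for d in diffs)
n_minus = sum(d < 0 for d in diffs)
n_zero = sum(d == 0 for d in diffs)


def sign_test_tail(n_plus, n_minus):
    """Probability of a split at least this uneven in one direction when
    plus and minus signs are equally likely; zero differences are ignored."""
    n = n_plus + n_minus
    k = min(n_plus, n_minus)
    return sum(comb(n, i) for i in range(k + 1)) / 2 ** n


one_tailed = sign_test_tail(n_plus, n_minus)
two_tailed = min(1.0, 2 * one_tailed)  # both directions count as extreme
print(n_plus, n_minus, n_zero)
print(one_tailed, two_tailed)
```

The two-tailed probability comes out below 0.01, in agreement with the rejection at the 1-percent level; the one-tailed probability is half of it.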

The t test of differences of either the one-tailed or two-tailed form
here would have yielded a quite insignificant result (t = 0.60) which
would have ignored the fact that although the differences were small,
they were almost all in the same direction.

Probability levels for the sign test are tabulated in Dixon (1). The
5-percent level of the two-tailed test can be obtained from the formula
$[(N - 1)/2] - (0.98)\sqrt{N + 1}$, the level being the integral part of
this value, where $N$ is the number of pairs in the sample, ignoring all
pairs with a difference of zero. In our sample, $N = 21$, the formula
yields $(20/2) - (0.98)\sqrt{22} = 10 - 4.596 = 5.404$ and we shall
reject the null hypothesis at the 5-percent level if there are 5 or fewer
plus or minus signs among the 21 differences.
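The formula can be compared with a critical value computed directly from the binomial distribution. This is a sketch under the assumption that the tabulated level is the largest count of the rarer sign whose doubled binomial tail does not exceed 0.05; both function names are illustrative.

```python
from math import comb, floor, sqrt


def approx_critical(n):
    """Integral part of (N - 1)/2 - 0.98 * sqrt(N + 1)."""
    return floor((n - 1) / 2 - 0.98 * sqrt(n + 1))


def exact_critical(n, alpha=0.05):
    """Largest k with 2 * P(X <= k) <= alpha for X ~ Binomial(n, 1/2).

    Returns -1 when no count is extreme enough to reject.
    """
    total = 2 ** n
    cumulative = 0
    k = -1
    for i in range(n + 1):
        cumulative += comb(n, i)
        if 2 * cumulative / total > alpha:
            break
        k = i
    return k


print(approx_critical(21), exact_critical(21))
```

For N = 21 both give 5. The formula is an approximation, so it need not coincide with the exact binomial value for every N.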

The sign test is perhaps the simplest of all distribution-free tests
and at the same time is most readily generalized to
analysis-of-variance-type problems. This generalization will be
discussed after the presentation of a number of two-sample tests.

THE WALD-WOLFOWITZ RUN TEST (16). The Wald-Wolfowitz run
test is specifically designed to test the null hypothesis that two samples
were drawn from populations having the same continuous distribution.
The alternatives are the extremely large class of hypotheses that the
samples were drawn from different continuous distributions. The test,
then, would tend to reject the null hypothesis if two populations were
... implies that gross measurement is the only reason for ...
to have the same value.

The power efficiency of this test is ...; a small
number of empirical tests by the author has indicated a number
somewhere in the neighborhood of 75 percent ...
the distributions differed only in means and when samp...

Let us say that we have two groups of students from a large city, one
of which has just completed a course in religious education, the other
group not having had this course. We wish to test the hypothesis that
our religious-education course has no effect on attitudes toward
religion as measured by a new scale. Say, also, that the scores on the
attitude scale are as follows:

Religious education: 52, 53, 71, 86, 95, 108, 115, 120, 141, 152,
165, 218.

No religious education: 30, 45, 54, 74, 75, 81, 101, 104, 146, 151,
170, 171.

The test consists in arranging all the observations from both samples
in a single order and then counting the number of runs, that is,
unbroken sequences of observations from the same group. In the example
we have 30, 45, 52, 53, 54, 71, 74, 75, 81, 86, 95, 101, 104, 108, 115,
120, 141, 146, 151, 152, 165, 170, 171, 218. Our test statistic is the
number of runs, r, which in this case is 12. Under the null hypothesis,
the expected number of runs, $r_0$, is given by

$$r_0 = E(r) = \frac{2 N_1 N_2}{N_1 + N_2} + 1.$$

The variance of r is

$$\sigma_r^2 = \frac{2 N_1 N_2 (2 N_1 N_2 - N_1 - N_2)}{(N_1 + N_2)^2 (N_1 + N_2 - 1)},$$

where $N_1$ is the number in one sample, $N_2$ the number in the other.
In the example $N_1 = N_2 = 12$,

$$r_0 = \frac{2(12)(12)}{24} + 1 = 13, \qquad
\sigma_r = \sqrt{\frac{2(12)(12)(264)}{(24)^2 (23)}} = 2.40.$$

For values of $N_1$ and $N_2$ larger than 10, r is approximately normally
distributed and large values of r tend to confirm the null hypothesis.
Hence, we form the normal deviate $C = (r - r_0)/\sigma_r$ and reject the
null hypothesis at the 95-percent level whenever
$(r - r_0)/\sigma_r < -1.645$.

In this example, C = -1/2.40 = -0.417 and the null hypothesis
would not be rejected. Our conclusion would be that we have no
evidence that such a course has any effect on attitudes toward religion.
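The whole calculation for the run test can be sketched in a few lines.
This is an illustrative sketch only: the function names are hypothetical,
and it assumes there are no ties across the two samples.

```python
import math

religious = [52, 53, 71, 86, 95, 108, 115, 120, 141, 152, 165, 218]
no_religious = [30, 45, 54, 74, 75, 81, 101, 104, 146, 151, 170, 171]


def count_runs(sample_a, sample_b):
    """Number of runs in the pooled, ordered sample (no cross-sample ties)."""
    labelled = sorted([(x, "a") for x in sample_a] + [(x, "b") for x in sample_b])
    labels = [label for _, label in labelled]
    # A new run starts wherever the group label changes.
    return 1 + sum(1 for prev, cur in zip(labels, labels[1:]) if prev != cur)


def run_test(sample_a, sample_b):
    """Return r, its expectation, its standard deviation, and the deviate."""
    n1, n2 = len(sample_a), len(sample_b)
    r = count_runs(sample_a, sample_b)
    r0 = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    sigma = math.sqrt(variance)
    return r, r0, sigma, (r - r0) / sigma


r, r0, sigma_r, deviate = run_test(religious, no_religious)
print(r, r0, round(sigma_r, 2), round(deviate, 3))
```

The deviate is compared with -1.645, since only too few runs count as
evidence against the null hypothesis.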

Had we assumed normality and used the two-tailed t test of
significance of difference between means, we should have reached
about the same conclusion (t = 0.73), except that it would have been
qualified by the clause "if scores on this scale are normally distributed."

It is important to note again that, although we used a normal
deviate to find the probability of our observation, the validity of the
test depends in no way on the normality of the observations but only
on the distribution of r in two samples from the same population.
That this distribution becomes normal for large samples can be proved
mathematically. We use it here as an approximation to the true
distribution, which is much more difficult to use. For sample sizes of ...
and 12, the approximation would call 9 runs significant at the
...-percent level, whereas the true distribution would call 9 runs
significant at the 6.9-percent level. The correspondence would be even
closer for larger samples.*

* For exact significance levels for $N_1, N_2 < 10$, see Swed and
Eisenhart (14).

Theoretically, if the original population is continuous, there
should be no two observations with the same value. Because of coarse
measurement, however, such ties do occur. When they occur the
sequence of ranked observations is not unique. Any of several sequences
may adequately describe the data, and there is no logical way of
distinguishing between them. Of course, if all ties are within one sample,
the number of runs is the same for each sequence and our test is
unaffected, but if observations in one sample are tied with observations
in the other, the number of runs cannot be uniquely determined from
the data.

Let us consider our original example, slightly altered so that there
is one tie across samples. The combined, ordered sample might be: 30,
45, 52, 52, 53, 54, 71, 74, 75, 81, 86, 95, 101, 104, 108, 115, 120, 141, 146,
152, 165, 170, 171, 218. Here r = 12 or 14, depending upon which
observation of 52 is in reality the smaller. Neither value would be
significant, and consequently our decision would be the same. If the
proportion of ties is very large, ordinarily the number of runs is very
indeterminate, often ranging from very significant to very insignificant
numbers, and in such cases this test is inapplicable.
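The ambiguity can be seen by counting runs under each way of breaking
the tie. In this sketch it is assumed that the altered sample replaces
the score 151 in the group without religious education by 52 (the
ordered list above is consistent with that); the function name is
hypothetical.

```python
religious = [52, 53, 71, 86, 95, 108, 115, 120, 141, 152, 165, 218]
# Assumed alteration: 151 replaced by 52, giving one tie across samples.
no_religious_tied = [30, 45, 52, 54, 74, 75, 81, 101, 104, 146, 170, 171]


def count_runs_with_tie_rule(sample_a, sample_b, a_first):
    """Count runs, placing sample_a before sample_b within a tie if a_first."""
    key_a, key_b = (0, 1) if a_first else (1, 0)
    labelled = sorted(
        [(x, key_a, "a") for x in sample_a] + [(x, key_b, "b") for x in sample_b]
    )
    labels = [label for _, _, label in labelled]
    return 1 + sum(1 for prev, cur in zip(labels, labels[1:]) if prev != cur)


r_religious_first = count_runs_with_tie_rule(religious, no_religious_tied, True)
r_other_first = count_runs_with_tie_rule(religious, no_religious_tied, False)
print(r_religious_first, r_other_first)
```

With a single cross-sample tie there are only these two orderings; with
many ties the set of possible orderings, and of values of r, grows
quickly.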

THE MANN-WHITNEY OR WILCOXON TEST (6, 18). Although subject
to the same ambiguities when ties in rank occur in the data, the
Mann-Whitney or Wilcoxon test is valuable since it provides a one-tailed
set of alternatives. The null hypothesis $H_0$ is again that the two
samples are drawn randomly from populations F and G having the same
distribution. Since this, like all nonparametric tests, is a test of
distributions, the alternatives must be stated rather differently. The
statement of $H_1$ is: F is stochastically larger than G. If X is one
observation from the distribution F and Y is from the distribution G, F
is stochastically larger than G if the probability of X larger than Y is
greater than 1/2. Loosely speaking, this implies that the "bulk" of F is
farther to the right than the "bulk" of G, or that F accumulates more
"slowly" from the left than does G. If the two populations are normal
with the same variance, this implies that the mean of F is larger than
the mean of G.

The Mann-Whitney test makes exactly the assumptions of the 
Wald-Wolfowitz test. 

Van der Vaart (15) has investigated the power function of the
Mann-Whitney test for normal samples and has found that, although
the t test is more powerful, the differences in power are quite small for
small samples and not very great even for larger samples. This test is
probably the simplest and most powerful nonparametric test yet
devised for detecting differences in location. The Wald-Wolfowitz test
seems to be better for detecting differences in dispersion.

As an illustration of the use of this test, suppose that from our
previous example our alternative hypotheses had been $H_1$: students
who have taken religious education have higher scores on this attitude
scale than do other students. Our statistic U is given by the number of
students without religious education who have higher scores than a
specific student with the education. This number is then summed over
all students with religious education. Let us tabulate this statistic for
our example.

TABLE I

Score on attitude scale for     Number of non-religious-education
religious-education students    students higher
            218                          0
            165                          2
            152                          2
            141                          4
            120                          4
            115                          4
            108                          4
             95                          6
             86                          6
             71                          9
             53                         10
             52                         10
                                  U =   61

If the null hypothesis is true, the expected value of U, $U_0$, is
$N_1 N_2 / 2$, or 72.

The variance is $\sigma_U^2 = (1/12) N_1 N_2 (N_1 + N_2 + 1)$, or 300.
When the two samples are both larger than 10, $(U - U_0)/\sigma_U$ is
approximately normally distributed, and we shall reject the null
hypothesis at the 5-percent level if this is less than -1.645. In our
case, $C = -11/(10\sqrt{3}) = -0.635$, which would have been significant
at the 26-percent level.

Ties in rank have less effect on U than on r, but large numbers
of ties still vitiate the value of the test. When ties do occur, they
should be counted as ... point.
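The tabulation of U and its normal deviate can be reproduced directly
from the two samples. This is a minimal sketch with hypothetical names;
it assumes no ties between the samples, so no tie rule is applied.

```python
import math

religious = [52, 53, 71, 86, 95, 108, 115, 120, 141, 152, 165, 218]
no_religious = [30, 45, 54, 74, 75, 81, 101, 104, 146, 151, 170, 171]


def mann_whitney_u(treated, control):
    """For each treated score, count control scores that are higher; sum."""
    return sum(1 for t in treated for c in control if c > t)


n1, n2 = len(religious), len(no_religious)
u = mann_whitney_u(religious, no_religious)
u0 = n1 * n2 / 2                      # expected value under the null hypothesis
variance_u = n1 * n2 * (n1 + n2 + 1) / 12
deviate = (u - u0) / math.sqrt(variance_u)
print(u, u0, variance_u, round(deviate, 3))
```

A small U means few untreated students outscore treated ones, so the
rejection region for this one-tailed alternative is the lower tail.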
MARSHALL'S TEST (7). Marshall's test is a large-sample test for
exactly the same purpose as the preceding test. It has the additional
important feature that ties in rank are quite unimportant in its
application. Also important is the fact that it may be applied to grouped
data for two samples if each sample is measured on the same continuum.
It is to be expected that this test is very useful to the survey researcher,
whereas the preceding test is more useful to the experimenter.

The power efficiency of the Marshall test for extremely large 
samples has been investigated for normal samples with known vari- 
ances. Preliminary results indicate that the power efficiency of the 
test is larger than 0.90 if the number of class intervals is larger than 5. 

Marshall’s original paper (7) contains an interesting application 
which should be read as well as the one presented here. The data 
presented here are taken from records of the 1937 American Council 
Psychological Examination. The sample size is extremely large, but 
use of the test requires only that each interval contain at least a total 
of 10 observations in the two samples. The hypothesis to be tested will
be $H_0$: males and females have the same distribution of scores on the
1937 ACPE, against the class of alternative hypotheses $H_1$: males have
higher scores than females. The original data are contained in Table II.


TABLE II

Scores        Male     Female      Total
0-29            32         27         59
30-59          583        387        970
60-89        2,061      1,606      3,667
90-119       4,164      3,439      7,603
120-149      5,945      5,090     11,035
150-179      6,536      5,376     11,912
180-209      5,452      4,440      9,892
210-239      3,819      3,153      6,972
240-269      2,227      1,832      4,059
270-299      1,116        789      1,905
300-329        433        248        681
330-359        123         60        183
360-...          9          3         12

            32,500     26,450     58,950


Each column of the original table is then cumulated in Table III
to provide the data from which we work.

13   360-...   32,500   26,450   58,950   1.0000   0.0000   0.0000   0.00000   0.00000

Totals        255,747  209,050                              1.32630   1.08210

The last five columns are all computed on the basis of the
cumulated Total column. The column $p_i$ is the cumulated percentage
of the total (for example, in the third row 0.0796 = 4696/58,950);
$q_i$ is one minus $p_i$; each $\sum q$ is the sum of the two entries
immediately below it in columns $\sum q$ and $q_i$; the headings on the
last two columns are self-explanatory. Where X is the sum of the
cumulated frequencies in the hypothesized larger sample, m the number in
the sample, Y the sum of the cumulated frequencies in the other sample,
and n the number in its sample, the test statistic is

$$S = (Y/n) - (X/m).$$

The variance of S is

$$\sigma_S^2 = \left[\frac{1}{m} + \frac{1}{n}\right]
\left[\sum_{i=1}^{j} p_i q_i + 2 \sum_{i=1}^{j-1} \sum_{k=i+1}^{j} p_i q_k\right],$$

where j is the number of categories.

Our test statistic here is S = (209,050/26,450) - (255,747/32,500)
= 7.9036 - 7.8691 = 0.0345. The variance of S is

$$\sigma_S^2 = [(1/32{,}500) + (1/26{,}450)]\,[1.0821 + 2(1.3263)]
= (0.68576 \times 10^{-4})(3.7347) = 0.00025611,$$

where 32,500 and 26,450 are the sample sizes, and 1.0821 and 1.3263 are
the sums of the last and next-to-last columns; hence $\sigma_S = 0.0160$.

S is approximately normally distributed, and the normal deviate
in this case is C = 0.0345/0.0160 = 2.156. The significance level
obtained from the upper tail of the normal curve for our result is 0.016,
between the 1- and 2-percent level.
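The computation can be retraced from the Table II frequencies. The
sketch below is illustrative only; variable names are hypothetical, and
the variance is taken in the form $[1/m + 1/n][\sum p_i q_i + 2\sum_{i<k}
p_i q_k]$, which matches the worked numbers above. Because it carries
full precision, its sums differ slightly in the third or fourth decimal
from the rounded column totals quoted in the text.

```python
import math
from itertools import accumulate

male = [32, 583, 2061, 4164, 5945, 6536, 5452, 3819, 2227, 1116, 433, 123, 9]
female = [27, 387, 1606, 3439, 5090, 5376, 4440, 3153, 1832, 789, 248, 60, 3]

cum_male = list(accumulate(male))
cum_female = list(accumulate(female))
m, n = cum_male[-1], cum_female[-1]          # sample sizes
X, Y = sum(cum_male), sum(cum_female)        # sums of cumulated frequencies
S = Y / n - X / m

# p_i: cumulated proportion of the combined sample; q_i = 1 - p_i.
cum_total = [a + b for a, b in zip(cum_male, cum_female)]
p = [c / cum_total[-1] for c in cum_total]
q = [1 - value for value in p]

sum_pq = sum(pi * qi for pi, qi in zip(p, q))

# tail[i] = sum of q over the categories after i, built from the bottom up.
tail = [0.0] * len(q)
for i in range(len(q) - 2, -1, -1):
    tail[i] = tail[i + 1] + q[i + 1]
sum_cross = sum(pi * ti for pi, ti in zip(p, tail))

sigma_S = math.sqrt((1 / m + 1 / n) * (sum_pq + 2 * sum_cross))
deviate = S / sigma_S
print(X, Y, round(S, 4), round(sigma_S, 4), round(deviate, 2))
```

Only the cumulated frequencies enter the statistic, which is why grouped
data and tied scores cause no difficulty.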

SMIRNOV TEST (8). The Smirnov test is based upon the same
reasoning as the Marshall test. It is a good deal simpler
computationally, tests exactly the same hypotheses, and also is valid for
large samples only. Its use is not recommended over the Marshall test
unless time is an important factor.

Little is known of the power efficiency of the Smirnov test, except
that it is almost certainly less power efficient than the Marshall test.
It is believed to be more powerful than the chi-square goodness-of-fit
tests (8).

We shall use the same data contained in Table III to illustrate
the application of this test. The numbers needed will be the sample
sizes, 32,500 and 26,450, and the cumulative percentages for each
sample as contained in Table IV.

TABLE IV

             Cumulated      Cumulated
             percentage,    percentage,
Scores       female         male           Difference
0-29         0.00102        0.00098         0.00004
30-59        0.01565        0.01892        -0.00327
60-89        0.07637        0.08234        -0.00597
90-119       0.20639        0.21045        -0.00406
120-149      0.39883        0.39338         0.00545
150-179      0.60207        0.59449         0.00758
180-209      0.76994        0.76224         0.00770
210-239      0.88915        0.87975         0.00940
240-269      0.95841        0.94827         0.01014
270-299      0.98824        0.98261         0.00563
300-329      0.99761        0.99593         0.00168
330-359      0.99988        0.99972         0.00016
360-...      1.00000        1.00000         0.00000

Our test statistic is the maximum difference of cumulative
percentages, which is 0.01014 in this case, multiplied by a factor which
takes into account the sample size, $\sqrt{nm/(n + m)}$, which in our case
is 120.75. The product of these, $\lambda$, is 1.2244. The probability
level is $e^{-2(1.2244)^2}$, or approximately 0.05. In general, the
5-percent level of $\lambda$ is 1.224; the 1-percent level is 1.517. The
significance level must be interpreted somewhat differently here, since it
is in reality an upper bound to the true significance level. Had we
computed $\lambda$ using the largest difference from all possible
classifications, we should have found the true significance level, which
would, of course, have been at least as small as the one we obtained here.
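The same figures follow from the Table II frequencies. This sketch uses
hypothetical variable names and takes the one-sided large-sample
probability as $e^{-2\lambda^2}$, the form that reproduces the values
quoted above; small differences in the last decimals come from the
rounding of the tabled percentages.

```python
import math
from itertools import accumulate

male = [32, 583, 2061, 4164, 5945, 6536, 5452, 3819, 2227, 1116, 433, 123, 9]
female = [27, 387, 1606, 3439, 5090, 5376, 4440, 3153, 1832, 789, 248, 60, 3]

m, n = sum(male), sum(female)
cum_pct_male = [c / m for c in accumulate(male)]
cum_pct_female = [c / n for c in accumulate(female)]

# One-sided statistic: largest excess of the female cumulative percentage
# over the male one (males hypothesized to score higher).
max_diff = max(f - g for f, g in zip(cum_pct_female, cum_pct_male))
size_factor = math.sqrt(n * m / (n + m))
lam = max_diff * size_factor
probability = math.exp(-2 * lam ** 2)
print(round(max_diff, 5), round(size_factor, 2), round(lam, 3), round(probability, 3))
```

The maximum falls in the 240-269 interval, as in Table IV.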

It is rather interesting to note that the normal test of difference
between means, which is probably applicable to these extremely large
samples, yields a difference in means of 1.033 with a standard error of
0.474, which is significant at about the 1.5-percent level for a one-tailed
test. The correspondence with the Marshall test is amazingly close.

PITMAN'S RANDOMIZATION TESTS (11). Pitman's randomization
tests are a set of nonparametric tests which are applicable in a wide
number of situations but which, unfortunately, are much too difficult
computationally to be used often. The tests are less difficult for
extremely small samples (n < 10).

Let us suppose, as an example, that we have the responses of two 
groups of five people each to the question “How much time per week 
should a man spend doing home repairs?” We wish to test the hypoth- 
esis of no mean difference against the alternative set of hypotheses 
that the Group 1 population has a different mean from the Group 2 
population. Let the data be as follows: 


Person    Group 1        Group 2    Person
 (1)       15.20          14.45      (6)
 (2)       20.10           6.30      (7)
 (3)        7.95          10.90      (8)
 (4)       10.80           8.10      (9)
 (5)        6.85           8.40      (10)

$\bar{X}_1 = 12.18$        $\bar{X}_2 = 9.63$        $\bar{X}_1 - \bar{X}_2 = 2.55$

Now if the null hypothesis is true, all ten observations are from a
common population, and the splitting of the sample in two is merely a
matter of chance, i.e., any of the $\binom{10}{5} = 252$ ways of carrying
out such a split is equally likely. On the other hand, if the alternative
hypothesis is true, we expect that there would be more than a chance
difference between the group means. Only once in 252 such experiments
should we expect the five largest observations in one group and
the five smallest in the other group. If such a thing occurred, we
should reject the null hypothesis at the 0.4-percent level. Similarly,
the observed mean difference would be among the twelve largest
possible such differences less than 5 percent of the times if the
observations are from a single population, i.e., if the grouping is random.
Here, then, is our test. We simply attempt to find 12 groupings of the
10 observations which give a larger mean difference than the one
(2.55) we found. If the one we found is among the twelve largest, we
reject the null hypothesis at the 5-percent level. If it is not, we do not.

In our example, we cannot reject the null hypothesis. More than
twelve combinations which give a larger mean difference are shown
below.


Persons in Group 1    $\bar{X}_1$    $\bar{X}_1 - \bar{X}_2$
1, 2, 6, 8, 4           14.2           6.59
1, 2, 6, 8, 10          13.72          5.63
1, 2, 6, 8, 9           13.66          5.51
1, 2, 6, 8, 3           13.63          5.45
1, 2, 6, 8, 5           13.41          5.01
1, 2, 6, 8, 7           13.30          4.79
1, 2, 6, 4, 10          13.70          5.59
1, 2, 6, 4, 9           13.69          5.57
1, 2, 6, 4, 3           13.61          5.41
1, 2, 6, 4, 5           13.39          4.97
1, 2, 6, 4, 7           13.28          4.75
1, 2, 6, 9, 10          13.25          4.69
1, 2, 6, 10, 3          13.22          4.63


Under rather general conditions, this test leads to the same
conclusions as would the ordinary t test, but the power efficiency of the
Pitman tests in nonnormal universes is unknown.
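For samples this small the full enumeration is easy to carry out by
machine. The sketch below, with hypothetical variable names, lists all
252 splits of the ten observations given above and counts those whose
mean difference (Group 1 minus Group 2) exceeds the observed one. It
counts differences in the observed direction only, which is an
assumption about how "larger" is to be read.

```python
from itertools import combinations

group1 = [15.20, 20.10, 7.95, 10.80, 6.85]
group2 = [14.45, 6.30, 10.90, 8.10, 8.40]

pooled = group1 + group2
total = sum(pooled)
size = len(group1)
observed = sum(group1) / size - sum(group2) / len(group2)

n_splits = 0
n_larger = 0
for chosen in combinations(range(len(pooled)), size):
    first_sum = sum(pooled[i] for i in chosen)
    difference = first_sum / size - (total - first_sum) / (len(pooled) - size)
    n_splits += 1
    # Small tolerance so the observed split itself is not counted as larger.
    if difference > observed + 1e-9:
        n_larger += 1

print(n_splits, round(observed, 2), n_larger)
```

If more than twelve splits give a larger difference, the observed split
is not among the twelve largest and the null hypothesis is not rejected
at the 5-percent level.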

THE MEDIAN TEST (17). The median test is rather similar to the
run test presented earlier and is a generalization of the sign test.* It is
subject to somewhat less difficulty with ties in rank than is the run test,
since ties are important only when they occur at the median of the
combined sample. It has the further advantage of being rather
insensitive to differences in dispersion. If, for example, the ordered
samples A and B took the form aaaabbbbbbbbaaaa, where a is an
observation from the A sample and b an observation from the B sample, the
run test would reject the null hypothesis of no difference in the two
populations, since there are only three runs. Since the samples have
the same median, the median test would not be significant. The
difference in the two samples is the wide dispersion of A compared to the
dispersion of B. The hypotheses tested here are $H_0$: the two populations
have the same median; against $H_1$: the medians are different.

Results on the power efficiency of this test are not available. As
was noted above, it should be more powerful against alternatives which
specify differences in location than against alternatives specifying
differences in dispersion.

In application, the test is particularly simple. Let us say that we
have $n_1$ observations in one sample and $n_2$ in the other. We find the
median of the $(n_1 + n_2)$ observations and count the number of
observations in each sample above this median. If the two populations have
the same median, we would expect these numbers to be about $n_1/2$ and
$n_2/2$. Deviations from this expectation are tested using $\chi^2$ as a
criterion. If we call the observed number of observations above the
median in each sample $m_1$ and $m_2$, we need only compute $\chi^2$ with
1 d.f. from the tabulation:

$m_1$            $m_2$            $(n_1 + n_2)/2$
$n_1 - m_1$      $n_2 - m_2$      $(n_1 + n_2)/2$
$n_1$            $n_2$            $n_1 + n_2$


Let us take as an example 30 subjects drawn randomly, of whom
10 are male and 20 female. They score in the following manner on a
short attitude questionnaire.

Men: 12, 21, 28, 37, 38, 38, 39, 40, 42, 51.

Women: 12, 13, 15, 15, 21, 23, 24, 24, 25, 27, 30, 31, 32, 33, 34,
34, 36, 38, 43, 44.

The median of the combined sample is 31.5, i.e., 15 observations
are smaller, 15 larger. The chi-square test is

             Men    Women
Larger         7       8      15
Smaller        3      12      15
              10      20      30

Here $\chi^2$ (1 d.f.) $= [30(60)^2]/[10 \cdot 20 \cdot 15 \cdot 15] = 2.40$,
with probability level about 12 percent. The t test of differences between
means yields a significance level of about 8 percent (t = 1.78). We
conclude that we have no evidence of any difference between men and women
on this questionnaire.

The example used to illustrate the run test can also be analyzed
here, yielding a $\chi^2$ table:

 7     5    12
 5     7    12
12    12    24

$\chi^2$ (1 d.f.) = 0.67, which is not significant even at the 50-percent
level.
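Both tabulations can be produced mechanically. The following sketch
(the function name is hypothetical) builds the 2 by 2 table from the
combined median and applies the usual shortcut formula for chi-square
with one degree of freedom, without a continuity correction, which is
what the worked figures above imply.

```python
import statistics


def median_test(sample_1, sample_2):
    """Return ((m1, m2), (n1 - m1, n2 - m2)) and chi-square with 1 d.f."""
    combined_median = statistics.median(sample_1 + sample_2)
    n1, n2 = len(sample_1), len(sample_2)
    m1 = sum(1 for x in sample_1 if x > combined_median)
    m2 = sum(1 for x in sample_2 if x > combined_median)
    above, below = m1 + m2, (n1 - m1) + (n2 - m2)
    determinant = m1 * (n2 - m2) - m2 * (n1 - m1)
    chi_square = (n1 + n2) * determinant ** 2 / (n1 * n2 * above * below)
    return ((m1, m2), (n1 - m1, n2 - m2)), chi_square


men = [12, 21, 28, 37, 38, 38, 39, 40, 42, 51]
women = [12, 13, 15, 15, 21, 23, 24, 24, 25, 27,
         30, 31, 32, 33, 34, 34, 36, 38, 43, 44]
table_sex, chi_sex = median_test(men, women)

religious = [52, 53, 71, 86, 95, 108, 115, 120, 141, 152, 165, 218]
no_religious = [30, 45, 54, 74, 75, 81, 101, 104, 146, 151, 170, 171]
table_course, chi_course = median_test(religious, no_religious)

print(table_sex, round(chi_sex, 2))
print(table_course, round(chi_course, 2))
```

Observations equal to the combined median would fall in the "not above"
row here; neither example has any.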

When the combined sample contains an even number of observations
with no tied observations at the median, there will be $(n_1 + n_2)/2$
observations larger than the median and the same number smaller
than the median. When $(n_1 + n_2)$ is odd, the sample median is one of
the observations and it is not determined in which cell that observation
is to be entered. If we make the rule arbitrarily to include it in the
class less than the median, the test in the long run will not be affected.
The same arbitrary rule provides us with an approximate solution
when ties occur at the median; that is, to call $m_1$ and $m_2$ the number
of observations above the median, and $n_1 - m_1$ and $n_2 - m_2$ the
number of observations less than or equal to the median. It is only
approximate because our measurement is too crude to differentiate between
observational values.

AN ANALOGUE OF THE ONE-FACTOR ANALYSIS OF VARIANCE TEST* (9).
This test is a straightforward extension of the median test in the same
way as the simple F test is an extension of the t test. We go from testing
the null hypothesis that two medians are equal to the null hypothesis
that a number, k, of medians are equal, against the alternatives $H_1$
that at least one of the k medians is different from the others.

No information is available concerning the power efficiency of this
test. Empirical studies comparing this test with the F test in normal
populations have yielded quite good results.

Let us say that we have k different levels of the factor with $n_i$
observations at the ith level. We find the median of all the observations,
count the number of observations at each level above the median, the
$m_i$'s, and enter these in the first row of a 2 by k table. In the second
row, we enter the number $n_i - m_i$. The $\chi^2$ test of independence of
rows from columns is then a test of significance of differences among
medians.

As an example, we shall analyze the following data from Edwards (2).

* Various other nonparametric analyses of specific experimental designs
are discussed fully in Mood (9).

                      Levels
              1      2      3      4      5
             13      7     12     10     13
              9      4     11     12      6
              8      4      4      9     14
              7      1      9      7     12
              8     10      5     15     13
              6      7     10     14     10
              6      5      2     10      8
              7      9      8     17      4
              6      5      3     14      9
             10      8      6     12     11

Mean          8      6      7     12     10      Grand mean = 8.6
Median      7.5      6      7     12   10.5      Grand median = 8.5

An F test of these data yields a highly significant result
($F_{4,45} = 6.52$). Our test is as follows:

1. The median of the 50 values is 8.5.

2. The 2 by 5 table is

                       Level
               1     2     3     4     5
Above 8.5      3     2     4     9     7     25
Below 8.5      7     8     6     1     3     25
              10    10    10    10    10     50

$$\chi^2_{4\ \mathrm{d.f.}} = \frac{(5-3)^2}{5} + \frac{(5-7)^2}{5}
+ \frac{(5-2)^2}{5} + \cdots + \frac{(5-3)^2}{5} = 13.6.$$

This value, though not as significant as the F value, attains the
1-percent level. The test would not be expected to give as significant
results, since the samples were drawn from normal populations. A further
reason for our test's moderate lack of sensitivity is that we have used
only an approximate value for our $\chi^2$ test. Using the maximum
likelihood value (9, p. 276) we obtain a $\chi^2 = 14.97$, which attains
significance at the ...-percent level, much closer to the significance
level of F. We would reject, at the 1-percent level, the hypothesis that
the 5 samples were drawn from populations with the same median.

[Table V. Column medians: 61.5, 53.5, 78.5, 78.5. A negative sign
indicates that the observation is below the column median; a positive
sign, above the median.]
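The 2 by k tabulation and the approximate chi-square can be sketched as
follows for the Edwards data. Names are hypothetical; the statistic
computed is the ordinary chi-square test of independence (the
approximate value of 13.6), not the maximum likelihood value mentioned
in the text.

```python
import statistics

levels = [
    [13, 9, 8, 7, 8, 6, 6, 7, 6, 10],
    [7, 4, 4, 1, 10, 7, 5, 9, 5, 8],
    [12, 11, 4, 9, 5, 10, 2, 8, 3, 6],
    [10, 12, 9, 7, 15, 14, 10, 17, 14, 12],
    [13, 6, 14, 12, 13, 10, 8, 4, 9, 11],
]

all_values = [x for level in levels for x in level]
grand_median = statistics.median(all_values)

above = [sum(1 for x in level if x > grand_median) for level in levels]
below = [len(level) - a for level, a in zip(levels, above)]

# Chi-square of independence: expected count = column total * row total / N.
n_total = len(all_values)
chi_square = 0.0
for level, a, b in zip(levels, above, below):
    expected_above = len(level) * sum(above) / n_total
    expected_below = len(level) * sum(below) / n_total
    chi_square += (a - expected_above) ** 2 / expected_above
    chi_square += (b - expected_below) ** 2 / expected_below

degrees_of_freedom = len(levels) - 1
print(grand_median, above, below, round(chi_square, 1), degrees_of_freedom)
```

With equal numbers at every level each expected count is 5, which gives
the terms $(5 - m_i)^2/5$ shown in the worked example.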


Tests of Relationship 


This section will be devoted to a rather brief discussion of tests of
association in a bivariate population. None of the methods in this
section has been adequately investigated for power efficiency. They
remain useful, however, especially when normality assumptions are
intolerable.

One example (13) will be followed through the various tests
discussed, and also through the following section on estimation. The
data consist of scores on a thirty-item attitude-toward-children scale
obtained by eighteen couples before and after an educational program.
...-point scoring was used, and the seventy-two scores obtained
ranged from 56 to 110, the maximum possible range being 0 to 120.
A high score indicates a "permissive" attitude, a low score a "rigid"
attitude. The data are contained in Table V.

THE CONTINGENCY TEST (9). The first and simplest test is the test
of whether the linear regression line of one variable on the other, fitted
nonparametrically (9, p. 408), has a slope of zero. The test becomes just
a test of linear independence in the 2 X 2 table formed by the X median
and the Y median. Under the null hypothesis, one would expect about
n/4 observations in each cell. Departures from this indicate association;
i.e., high values of one variable tend to be associated with
high (low) values of the other. This is called positive (negative)
association.

Four different tests of interest are possible on the data in our
sample. The first test will be, "Is there significant association between
husbands' and wives' scores on the 'before' test?" The 2 X 2 table shows
positive association, but with a $\chi^2$ of 2, is not significant at the
... level. It should be noted that the cell frequencies are not large
enough to make the chi-square test accurate.

                     Husband
                 >med.   <med.
Wife    >med.      6       3       9
        <med.      3       6       9
                   9       9      18

The same test on the "after" questionnaire can be made on the
2 X 2 table with the same results as before.

              Men
         >med.   <med.
           6       3       9
           3       6       9
           9       9      18

The other tests, women before and after and men before and after,
show highly significant association, as one would expect. The phi
coefficient can be used as an estimate or index of association, being
+0.33 in these cases.

We should not expect this test to be very powerful, since only a
few slight shifts in scores near the median could radically alter the
conclusions drawn. Its ease, however, recommends it as a simple
preliminary device to find highly related variables.
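The chi-square value and the phi coefficient for a median-split table
follow from the four cell counts alone. A minimal sketch, with a
hypothetical function name and no continuity correction:

```python
import math


def chi_square_and_phi(a, b, c, d):
    """Chi-square (1 d.f.) and phi for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    determinant = a * d - b * c
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    chi_square = n * determinant ** 2 / margins
    phi = determinant / math.sqrt(margins)
    return chi_square, phi


# Husband-wife table from the text: 6 and 3 in the first row, 3 and 6 below.
chi_square, phi = chi_square_and_phi(6, 3, 3, 6)
print(round(chi_square, 2), round(phi, 2))
```

Since chi-square equals n times phi squared, the two quantities carry
the same information for a table of fixed size.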

THE CORNER TEST (10). This test lays much more emphasis on
the association observed between the two variables at the extreme
values of each than does any similar test so far proposed. In application
it is not quite so simple as the one above, but it would seem to be much
more powerful. This emphasis on extreme values is a rather valuable
contribution of the test, since one is often most interested in the
characteristics of persons who deviate a great deal from the norm.
This interest is often due to the feeling that minor extraneous factors
may disturb the relationship between two variates where neither is
... the other variate is present to a high degree.

... administration ... previously. The scores are plotted in Figure 6,
along with the two medians, the x median and the y median. The
additional boundary lines are added as follows:

FIG. 6. An illustration of the use of the corner test on the data of
Table V. (Axis label: WIFE TEST.)

1. $L_1$ is drawn parallel to the y median and as close to it as possible
under the restriction that all points above $L_1$ are on one side of the
x median. Call this the "on" side of the x median. If $L_1$ were closer to
the y median, a point to the left of the x median would be above $L_1$.
... the boundary line is drawn through them.

2. Similarly, all points to the right of $L_2$ lie on one side of the
y median, those below $L_3$ are on one side of the x median, and points to
the left of $L_4$ lie on one side of the y median.

3. Points above $L_1$ are counted, and a plus or minus sign is
attached according as the points are to the right or left of the x median.
In our illustration, this number is 6.

4. Points to the right of $L_2$ are counted and a plus or minus sign
is attached according as they are above or below the y median. In our
illustration, this number is 3.

5. Points below $L_3$ are counted and a plus or minus sign is attached
according as the points are to the left or right of the x median. In our
illustration, this is 1.5, counting only half of the tied point.

6. Points to the left of $L_4$ are counted and a plus or minus sign is
attached according as they are below or above the y median. In our
illustration, this is 3.

7. These numbers are algebraically summed to get the test
criterion (in our illustration, r = 13.5).

8. In the limiting case, as sample size becomes large, the 5-percent
significance level is given by |r| > 11, and this criterion is adequate
if sample size is larger than 10. For sample size of 11, not all points
need to be outside the boundary, since "corner" points are counted
twice. In our case, r = 13.5, so the null hypothesis is rejected at the
5-percent level.

It should be noted that, both in this test and the preceding one,
the procedure cannot be followed without modification if there are ties
either at one of the medians or on a boundary line, or if there is an odd
number of observations.

When there is an odd number of observations, one observation
will lie on each median. For example, the median of x values might be
71 and the median of y values 84, with the two observations being
(71, 19) and (47, 84), and we could not tell whether either point was
on the "on" or "off" side of the median. The ambiguity can be
removed by substituting the observation (47, 19) for the two observations.
This substitution does not bias the test in any way. In this way we
have an even number of observations without affecting either median.
It is possible, of course, that the odd observation lies on both medians.
In our example it might be (71, 84). In this case we may neglect this
observation, since it cannot affect the quadrant sums.

Ties not on a median are important only if they affect the position
of one of the boundary lines L. In Figure 6, the two observations
(61, 43) and (81, 43) are tied on $L_3$. Consequently, we do not know
whether to count (61, 43) as outside or inside the boundary. The ...
Had there been an observation (70, 43), we should have counted it as
... boundary as

    (number on the "on" side of the median)
    ---------------------------------------------
    (1 + number on the "off" side of the median)

... 5 points had lain on ... to the left of the x median and 7 to the
right ... the 5 points would contribute ... to the value of r.
... This old class of statistics is perhaps the best known of all
nonparametric statistics. Originally conceived as a shorthand method of
estimating a product-moment correlation, a conception which forced
acceptance of many assumptions rarely satisfied in practice, plus several
orders of mathematical ..., they have come more and more to be considered
as indices of association per se.

No assumption need be made about populations except that
sample observations can be arranged in rank order, i.e., that a
simple ordering exists, and even where there are ties in the sample,
reasonable approximations can be used.
The Spearman coefficient $\rho$ (rho) is the earliest and best known
of these statistics. It ranges between plus and minus one and is
quite simple to compute. This, and the fact that it is closely related
to the concordance coefficient, are the only reasons we have for
including ... Kendall's $\tau$ (5), to be discussed later, is in all other
ways ... The computing formula is

$$\rho = 1 - \frac{\sum d_i^2}{(1/6)(n^3 - n)},$$

where n is the number of objects ranked and $d_i$ is the difference
between an object's rank on one ranking and its rank on the other. For
example, four attitude items on anti-Semitism are ranked by two judges;
that is, how much anti-Semitism one would have if he endorsed each
item. Suppose the rankings were:

Item        1    2    3    4
Judge 1     3    1    2    4
Judge 2     3    2    4    1

$\sum d^2 = (3 - 3)^2 + (1 - 2)^2 + (2 - 4)^2 + (4 - 1)^2 = 14$;
$\rho = 1 - [14]/[(1/6)(60)] = -0.40$.

When ties occur, the process becomes slightly more difficult. Ties
... The rankings would have been

Item       ...    2      3     ...
Judge 1    ...   2.5    2.5    ...
Judge 2    ...    2      4     ...

and $\sum d^2 = 15.5$. In the ... we would replace $(1/6)(n^3 - n)$
by $\sqrt{(1/6)(n^3 - n) - T}\,\sqrt{(1/6)(n^3 - n) - U} = \sqrt{95}$ ...
would become $1 - 15.5/\sqrt{95}$ ... The correction for ties in general
is that we divide by

$$\sqrt{\left[(1/6)(n^3 - n) - T\right]\left[(1/6)(n^3 - n) - U\right]}$$

instead of $(1/6)(n^3 - n)$, where $T = (1/12)\left[\sum (t^3 - t)\right]$,
t being the number of objects involved in one tied ranking, and
$U = (1/12) \sum (u^3 - u)$, u being the number involved in the other. In
the ranking 2, 2, 2, 4, 5, 6.5, 6.5, 9, 9, 9, for example,
$T = (1/12)[(3^3 - 3) + (2^3 - 2) + (3^3 - 3)] = (1/12)[54] = 4.5$. If the
other ranking were 3, 3, 3, 3, 3, 6, 7.5, 7.5, 9, 10,
$U = (1/12)[(5^3 - 5) + (2^3 - 2)] = 10.5$. The correlation would be
$1 - (12.5)/\sqrt{(160.5)(154.5)} = 0.92$.
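Both forms of the computation can be sketched briefly. The function
names are hypothetical. The four-item rankings are read off the squared
differences in the worked example, and the first ten-object ranking is
taken as 2, 2, 2, 4, 5, 6.5, 6.5, 9, 9, 9, an assumed reading that is
consistent with T = 4.5 and with the sum of squared differences of
12.5. The tie adjustment follows the formula as given in this text.

```python
import math
from collections import Counter


def sum_squared_differences(rank_x, rank_y):
    return sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))


def spearman_rho(rank_x, rank_y):
    """rho = 1 - sum(d^2) / [(1/6)(n^3 - n)], for rankings without ties."""
    n = len(rank_x)
    return 1 - sum_squared_differences(rank_x, rank_y) / ((n ** 3 - n) / 6)


def tie_term(ranks):
    """(1/12) * sum(t^3 - t) over groups of tied ranks; untied ranks add 0."""
    return sum(t ** 3 - t for t in Counter(ranks).values()) / 12


def spearman_rho_tie_adjusted(rank_x, rank_y):
    """Divide by sqrt{[(1/6)(n^3 - n) - T][(1/6)(n^3 - n) - U]} instead."""
    n = len(rank_x)
    base = (n ** 3 - n) / 6
    divisor = math.sqrt((base - tie_term(rank_x)) * (base - tie_term(rank_y)))
    return 1 - sum_squared_differences(rank_x, rank_y) / divisor


rho_simple = spearman_rho([3, 1, 2, 4], [3, 2, 4, 1])

ranking_1 = [2, 2, 2, 4, 5, 6.5, 6.5, 9, 9, 9]   # assumed reading, see above
ranking_2 = [3, 3, 3, 3, 3, 6, 7.5, 7.5, 9, 10]
T, U = tie_term(ranking_1), tie_term(ranking_2)
rho_tied = spearman_rho_tie_adjusted(ranking_1, ranking_2)
print(round(rho_simple, 2), T, U, round(rho_tied, 2))
```

When neither ranking contains ties, T and U are zero and the adjusted
form reduces to the simple one.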
In the attitude-study example, the Spearman correlation between
husband's and wife's scores on the first administration is 0.586; on
the second administration, 0.649. Couples seem to have become more
alike following the training course, although we have no way of
testing this. The correlation between the rankings of men before and
after training is 0.765; the same coefficient for women is 0.804.

When n is larger than 9, we can test the hypothesis that $\rho = 0$
against the alternatives $\rho \neq 0$ by forming
$t = \rho\sqrt{(n - 2)/(1 - \rho^2)}$, which is distributed as Student's
t with (n - 2) degrees of freedom. For the correlation between husbands
and wives on the first administration,
$t = 0.586\sqrt{16/[1 - (0.586)^2]} = 2.89$ with 16 degrees of freedom,
which is not quite significant at the 1-percent level. For n smaller
than 9, tables in Kendall (5) give significance levels.
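
As a check on the arithmetic, the test statistic can be computed directly. The function name is ours; the critical value must still be looked up in a table of Student's t.

```python
from math import sqrt


def spearman_t(rho, n):
    """t statistic for testing rho = 0, with n - 2 degrees of freedom."""
    return rho * sqrt((n - 2) / (1 - rho ** 2))


# Husbands versus wives, first administration: rho = 0.586 for 18 couples.
print(round(spearman_t(0.586, 18), 2))  # 2.89
```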

When a number of rankings of the same objects exist, it is sometimes
of interest to determine how much agreement there is among the
rankings. For example, we might have a number of judges rank a set of
statements for pro-war feeling, and we might wish to use average
rankings to get a best estimate of some "true ranking" with which the
judges more or less agree. We would want to be assured that there
exists some such ranking, and that the judges are not simply ordering
the statements at random. The coefficient of concordance, W, is a
measure of the amount of agreement: W = 1.0 if all judges agree
perfectly, and W = 0 if they disagree as much as possible. The
coefficient is given by the formula

$$W = \frac{12S}{m^2(n^3 - n)},$$

where there are n objects being ranked by m judges, and S is the sum of
the squares of the deviations of the rank sums for each object around
a mean of [m(n + 1)]/2. In our example, if we consider the two test
scores of the husband and the two of the wife as four rankings by
"judges" of the couples, the number of "objects" being ranked, the
number of couples, is 18, and the number of "judges" is four. The mean
rank assigned by each "judge" is 9.5, and hence the average rank sum is
38. The rank sum for the first couple is 60, for the second couple
(4 + 16.5 + 4 + 14) = 38.5, for the third couple 44, and so on. The sum
of squares, S, is then $(60 - 38)^2 + (38.5 - 38)^2 + (44 - 38)^2 + \cdots$.
In this case n = 18, m = 4, and W = 5693.5/7752 = 0.734. To test the
null hypothesis that W = 0 against the alternatives, W > 0, we can use
the Snedecor F distribution with F = [(m - 1)W]/(1 - W) = 8.35, with
[n - 1 - (2/m)] degrees of freedom in the numerator and
[(m - 1)(n - 1 - [2/m])] degrees of freedom in the denominator, here
16.5 and 49.5. There is a highly significant amount of agreement, and
we can proceed with ordering the couples with respect to
"permissiveness."

The average Spearman rank correlation between the 6 pairs of rankings
is given by the formula $\rho_{av} = (mW - 1)/(m - 1)$, and in our
example is 0.644.
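
A minimal sketch of the concordance computation follows. The function names are ours, and the data in the usage lines are made-up toy rankings, not the couples' scores, which the text does not reproduce in full.

```python
def kendalls_w(rankings):
    """Coefficient of concordance for m rankings of the same n objects.

    `rankings` is a list of m lists, each giving the ranks of the n objects.
    """
    m = len(rankings)
    n = len(rankings[0])
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    # S: squared deviations of the rank sums around their mean.
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def average_rho(w, m):
    """Average Spearman rho over all pairs of the m rankings."""
    return (m * w - 1) / (m - 1)


# Toy data: three judges in perfect agreement, then two in total reversal.
print(kendalls_w([[1, 2, 3], [1, 2, 3], [1, 2, 3]]))  # 1.0
print(kendalls_w([[1, 2, 3], [3, 2, 1]]))             # 0.0

# The text's figures: S = 5693.5 with m = 4 judges and n = 18 couples.
print(round(12 * 5693.5 / (4 ** 2 * (18 ** 3 - 18)), 3))  # 0.734
```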

KENDALL'S TAU (5) coefficient was devised for the same purpose as
the rho coefficient mentioned earlier. Although somewhat more difficult
to compute, it has the advantages of being generalizable into a partial
correlation coefficient, and of having an almost normal distribution
for samples as small as 9.

The computing formula is $\tau = 2S/n(n - 1)$, where n is again
the number of objects ranked. S is determined in this manner:

(1) Order the objects according to one of the rankings.

(2) In the other ranking, for each object count the number of
objects to its right having a smaller rank and the number having a
larger rank. Subtract the former from the latter.

(3) Sum the numbers thus obtained for all objects to obtain S.

As a simple example, consider the rankings 1, 4, 3, 5, 2, 6 and
2, 3, 4, 1, 5, 6. If we arrange the first in order, the order of the
second becomes 2, 5, 4, 3, 1, 6. Start with 2. Since 3, 4, 5, and 6 are
to its right and larger, and only 1 is smaller, the contribution is
4 - 1 or "3." For 5, only one number to its right is larger, 6, and 3
are smaller, so that 5 contributes a "-2"; the number 4 contributes a
"-1," the object ranked 3 contributes "0," and the object ranked 1
contributes a "+1." These contributions added give 3 - 2 - 1 + 0 + 1,
or 1. This is the value of S. $\tau$ is (2)(1)/(6)(5) = 1/15.
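
The three steps above can be sketched as a short routine. The names `kendall_s` and `kendall_tau` are ours, and the sketch assumes there are no ties.

```python
def kendall_s(ranks_a, ranks_b):
    """Kendall's S by the three-step counting rule (no ties assumed)."""
    # Step 1: order the objects according to the first ranking.
    order = sorted(range(len(ranks_a)), key=lambda i: ranks_a[i])
    second = [ranks_b[i] for i in order]
    s = 0
    for i, value in enumerate(second):
        right = second[i + 1:]
        # Step 2: larger ranks to the right minus smaller ranks to the right.
        larger = sum(1 for v in right if v > value)
        smaller = sum(1 for v in right if v < value)
        s += larger - smaller  # Step 3: accumulate the contributions.
    return s


def kendall_tau(ranks_a, ranks_b):
    n = len(ranks_a)
    return 2 * kendall_s(ranks_a, ranks_b) / (n * (n - 1))


a = [1, 4, 3, 5, 2, 6]
b = [2, 3, 4, 1, 5, 6]
print(kendall_s(a, b))               # 1
print(kendall_tau(a, b) == 1 / 15)   # True
```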

In the case of ties, $\tau$ becomes

$$\tau = \frac{S}{\sqrt{0.5n(n - 1) - T}\,\sqrt{0.5n(n - 1) - U}}.$$

For this coefficient $T = 0.5\sum t(t - 1)$ and $U = 0.5\sum u(u - 1)$,
where t and u are used in the same way as in Spearman's coefficient.
For example, in the ranking 1, 2, 4, 4, 4, 6.5, 6.5, 9, 9, 9, the
correction is 0.5[3·2 + 2·1 + 3·2], or 7. If another judge had ranked
the same objects 3, 3, 3, 4.5, 4.5, 7, 7, 7, 9, 10, S would be
7 + 7 + 7 + 5 + 5 + 2 + 2 + 2 + 1 = 38, $\sqrt{0.5n(n - 1) - T}$ would be
$\sqrt{(0.5)(10)(9) - 7}$, $\sqrt{0.5n(n - 1) - U}$ would be
$\sqrt{(0.5)(10)(9) - 7}$, and $\tau = 38/(\sqrt{38}\,\sqrt{38}) = 1$, as
we might expect, since each ranking contained the same number of ties
and was in the same order.
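
The correction terms for tau can be sketched in the same way as those for rho. The function names are ours. S is taken as given from the text (38) rather than recomputed, since the sketch is meant only to illustrate the divisor.

```python
from collections import Counter
from math import sqrt


def tau_tie_term(ranking):
    """0.5 * sum(t * (t - 1)) over every group of t tied ranks."""
    return 0.5 * sum(t * (t - 1) for t in Counter(ranking).values())


def tau_with_ties(s, ranks_a, ranks_b):
    """Tau with the tie-corrected divisor; S is supplied by the caller."""
    n = len(ranks_a)
    pairs = 0.5 * n * (n - 1)
    divisor = sqrt(pairs - tau_tie_term(ranks_a)) * sqrt(pairs - tau_tie_term(ranks_b))
    return s / divisor


first = [1, 2, 4, 4, 4, 6.5, 6.5, 9, 9, 9]
second = [3, 3, 3, 4.5, 4.5, 7, 7, 7, 9, 10]
print(tau_tie_term(first), tau_tie_term(second))  # 7.0 7.0
print(round(tau_with_ties(38, first, second), 6))  # 1.0
```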

If we label the columns of Table V m order 1. 2. 3, and 4, wc have 
rn =» 0457, t„ = 0 dSlJ, t„ = 0 477, t„ = 0 401, ts, = 0 559, and 
ri< = 0 493 

When n is greater than 10, $\tau$ is about normally distributed with a
mean of 0 and a variance of $[2/9][(2n + 5)/(n^2 - n)]$. In our example,
the variance is [2/9][41/306] = 0.029774, giving a standard error,
$\sigma$, of 0.1725. The normal deviate for $\tau_{12}$ = 0.457/0.1725 =
2.649, which is significant just under the 1-percent level and does not
differ much from the test of $\rho$ for the same rankings.
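
The normal approximation is easy to verify numerically. The function name is ours, and the sketch simply evaluates the variance formula quoted in the text for n = 18.

```python
from math import sqrt


def tau_standard_error(n):
    """Standard error of tau under the null hypothesis, for n above 10."""
    variance = (2 / 9) * (2 * n + 5) / (n ** 2 - n)
    return sqrt(variance)


se = tau_standard_error(18)
z = 0.457 / se  # normal deviate for the tau of 0.457 reported in the text
print(round(se, 3), round(z, 2))
```

Carrying more decimal places than the text does changes the deviate only in the third decimal.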

To discuss partial correlation, let us consider the three rankings
P, Q, and R, each listing five objects in the order in which it
places them:

    P    1  2  3  4  5
    Q    1  3  4  2  5
    R    2  3  4  1  5

Let us make a table with an entry for each pair of objects. A plus sign
is entered if the pair is in the order at the column head, a minus sign
if it is reversed.

         (12) (13) (14) (15) (23) (24) (25) (34) (35) (45)
    P     +    +    +    +    +    +    +    +    +    +
    Q     +    +    +    +    -    -    +    +    +    +
    R     -    -    -    +    +    +    +    +    +    +

The correlation between Q and R, partialling out P, is the phi
coefficient in the following 2 X 2 tabulation:

                                          Ranking R
                                  Pairs        Pairs
                                  agreeing     disagreeing
                                  with P       with P        Totals
    Ranking Q
      Pairs agreeing with P          5            3             8
      Pairs disagreeing with P       2            0             2
      Totals                         7            3            10

The phi coefficient is $-6/\sqrt{(2)(3)(7)(8)} = -0.327$. This is more
easily determined from the formula

$$\tau_{QR.P} = \frac{\tau_{QR} - \tau_{PQ}\,\tau_{PR}}{\sqrt{[1 - \tau_{PQ}^2][1 - \tau_{PR}^2]}}.$$

Here $\tau_{QR} = 0$, $\tau_{PQ} = 0.60$, $\tau_{PR} = 0.40$, and

$$\tau_{QR.P} = \frac{0 - (0.60)(0.40)}{\sqrt{(0.64)(0.84)}} = -0.327.$$
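
The partial coefficient can be reproduced with a brief sketch. It reads each of the three rows as the objects listed in rank order, an assumption that reproduces the values 0.60, 0.40, and 0 quoted in the text. The function names are ours, and ties are not handled.

```python
from itertools import combinations
from math import sqrt


def ranks_from_order(order):
    """Map each object to its rank, given the objects listed in rank order."""
    return {obj: pos for pos, obj in enumerate(order, start=1)}


def tau(order_a, order_b):
    """Kendall's tau between two untied orderings of the same objects."""
    a, b = ranks_from_order(order_a), ranks_from_order(order_b)
    pairs = list(combinations(sorted(a), 2))
    # +1 when the two rankings order a pair the same way, -1 otherwise.
    score = sum(1 if (a[x] - a[y]) * (b[x] - b[y]) > 0 else -1 for x, y in pairs)
    return score / len(pairs)


def partial_tau(t_qr, t_pq, t_pr):
    """Tau between Q and R with P partialled out."""
    return (t_qr - t_pq * t_pr) / sqrt((1 - t_pq ** 2) * (1 - t_pr ** 2))


P = [1, 2, 3, 4, 5]
Q = [1, 3, 4, 2, 5]
R = [2, 3, 4, 1, 5]
t_pq, t_pr, t_qr = tau(P, Q), tau(P, R), tau(Q, R)
print(t_pq, t_pr, t_qr)                        # 0.6 0.4 0.0
print(round(partial_tau(t_qr, t_pq, t_pr), 3))  # -0.327
```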

The partial correlation can be interpreted as the correlation
between the ratings Q and R when the factors on which P is rating are
held constant. This interpretation holds whether the "judges" are
actually judges, or tests, or any other instrument capable of arranging
objects in rank order. Unfortunately, the distribution of partial taus
is not known. In the short example just considered, we have some
evidence, however, that when the factor on which the P ranking is
based is held constant, Q and R are negatively related. P might be an
intelligence test, Q a test of number ability, and R a test of musical
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amluv in .ha. case one would “V *n. a«-o. 

;o:"ru:.tSv%Lan\^^^^ 

mg .he .raining period, partial, g ranking of the couples 

of the couples before .he .raining To “b.a n a 

husbands after .raining, par.iall.ng ou. .he factors common to those 

present before traimng, w 

$\tau$ = [0.493 - (0.669)(0.623)]/$\sqrt{(0.552)(0.612)}$ = 0.133,
indicating an extremely low relationship. This may be interpreted
as meaning that the relationship between husbands and wives
after training is mainly due to factors present before training. If a
partial correlation of 0.133 is significant for samples of size 18, we
could also say that training tended to make husbands' scores and wives'
scores more similar with respect to relative ranking in their
respective groups.


ESTIMATION WITH ORDER STATISTICS 


Percentiles (1, 9)


One of the most interesting results from order statistics is that the
expected proportion of the population falling between two adjacent
ordered observations is 1/(n + 1), where n is the size of the sample;
i.e., a sample is expected to divide the population into (n + 1) groups
of equal size. Because of this, the percentile points of the sample are
estimates of the percentile points of the population. For clarity we
will define $X_{i|n}$ as the ith smallest observation in a sample of
size n. For example, with a sample of 19, the fifth observation from
the bottom, $X_{5|19}$, is an estimate of the lower quartile, the
twenty-fifth percentile point, and in general, the ith observation
from the bottom, $X_{i|n}$, is an
estimate of the 100i/(n + 1) percentile pomt For a sample o , 
example, the closest observation to the twenty fifth percen i 
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the 24 percentile. 
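
The rule that the ith smallest of n observations estimates the 100i/(n + 1) percentile point can be sketched as below. Both function names are ours, and the second function, which picks the order statistic nearest a requested percentile, is an illustrative convenience rather than something the text defines.

```python
def estimated_percentile(i, n):
    """Percentile point estimated by the ith smallest of n observations."""
    return 100 * i / (n + 1)


def closest_order_statistic(percentile, n):
    """Index i whose estimated percentile is nearest the one requested."""
    i = round(percentile * (n + 1) / 100)
    return max(1, min(n, i))  # keep the index inside the sample


print(estimated_percentile(5, 19))      # 25.0
print(closest_order_statistic(25, 19))  # 5
```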

fro. ore bouo.. AW. «^;"erval for -he fOO.h pe. 

If we require a 95-perc observation (r < h r\ 

ccntile, we look for the r an percent of the samples o siz 
si that tn app™x|n.a ,.e be.een V,. 

draw, the pth P'-?""*" °i ,he%a«on 

AT.,. Any r and r that satisfy the eq 

* 1 theore'U In most 

wdldeterminesncha;.,.and„^.^^^^^^^^ 

cases we shall not he H a 95 -pcrccnt upp^r 

exactly, but we shall n ua ‘'l„ht want a number 

For a sample of size 1 i , jcentile-i ' > w he larger than 

confidence level for the „e could draw wou 

which. 95 P««1-Jt;Tp^^tion Here we choose r equal ^ 

the tenth percentile of th P P ^rst term \ o J 

and sum terms untU we reach 0 95 ^-allen sam^ 

„ 0 » « •> 

observation is hrgn than tn ^ ^ ,j,c sum 

second term is ('”) smaUest »-P'-‘:;rnr‘-e 

This IS the probabih^ th- ,,, pupula.ion 

larger than /io)r (9/10)«.» « 1 ’^^’ 0 9804 INe see 

third term is (. l J first four terms 

terms is 0 929, and the su ercentile is Ik bability 

that the probability that the t^^^ P^ ” ut 0 99 

smallest observauon m observation i intervals 

that It IS less than the « sh to find ^pproxt- 

When we have 'aeS' ^P„,„.,h and eightieth, 
for percentiles between percentile i' J 

mation can be used , for 

A 95 -percent e°nfidence ^ ^ f + ,) /lU 

Xr\n <^th percentile ' . |)/i00l + CO confidence limits 

and 4 ^ approxn"®" P'" poml esnmme is 

Fcr«ample.te.usfind'lwaPJ^^,^ 

of the interval is [1 9 / 
approximately 7. Thus the 95-percent confidence interval for the
twentieth percentile is $X_{9|79}$ to $X_{23|79}$. The interpretation
is this: "In 95 percent of samples of size 79 drawn from a continuous
population, the twentieth percentile of the population will lie
between the ninth smallest and the twenty-third smallest observations."
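
The worked example can be reproduced under one assumption: that the large-sample limits are taken at (n + 1)p plus or minus 1.96 times the binomial standard deviation $\sqrt{np(1 - p)}$, with p written as a proportion. This is the usual normal approximation and it yields the interval quoted above, but the exact form printed in the original is not certain. The function name is ours.

```python
from math import sqrt


def percentile_interval_ranks(p, n, z=1.96):
    """Approximate ranks (r, s) bounding the p-th quantile of the population.

    p is a proportion (0.20 for the twentieth percentile); z = 1.96 gives
    roughly 95 percent confidence. Assumes the normal approximation to the
    binomial, which suits large n and p between about 0.2 and 0.8.
    """
    centre = (n + 1) * p
    half_width = z * sqrt(n * p * (1 - p))
    return round(centre - half_width), round(centre + half_width)


print(percentile_interval_ranks(0.20, 79))  # (9, 23)
```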


Estimation of the Cumulative Distribution (7)

A method based on the same theory as the Smirnov test discussed
earlier is available for estimating the cumulative distribution (ogive)
of the population from that of the sample. In Figure 7, we have plotted
an example based on the pretraining scores of men and women on the
attitude scale. The jagged line is the cumulative distribution of the
sample. Confidence bounds for the population cumulative are computed by
finding $d_{0.90} = 1.22/\sqrt{n}$; in our example, $d_{0.90} = 0.20$.
The upper confidence bound is the sample cumulative shifted vertically
upward 20 percentage points. The lower bound is parallel and shifted
downward 20 percentage points.

Fig. 7. A comparison of the distribution of pretraining scores on the
Stott-Berson attitude scale with a normal distribution having
the same mean and variance.

If we wish to test whether the sample could have come from some
theoretical distribution, we need only plot the theoretical cumulative.
If, at any point, it does not lie between the bounds, the null
hypothesis that the sample was drawn from a population with this
distribution is rejected at the level used in computing the bounds. The
smooth curve in the figure is the cumulative distribution of a normal
curve with mean of 68.22 and standard error of 20.17. Since it lies
entirely within the bounds, we cannot reject the hypothesis that the
sample has been drawn from such a population.
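
The band construction can be sketched as follows. The coefficient 1.22 for the 90-percent level is the one quoted in the text. The function names are ours, the usage data are placeholders rather than the attitude scores, and checking the theoretical curve only at the observed values is a simplification of reading the bounds off a graph.

```python
from math import erf, sqrt


def confidence_band(sample, coefficient=1.22):
    """Return d and (x, lower, upper) bounds around the sample cumulative."""
    n = len(sample)
    d = coefficient / sqrt(n)
    band = []
    for k, x in enumerate(sorted(sample), start=1):
        f = k / n  # sample cumulative proportion at x
        band.append((x, max(0.0, f - d), min(1.0, f + d)))
    return d, band


def normal_cdf(x, mean, sd):
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))


def lies_within(band, cdf):
    """True if the theoretical cumulative stays inside the band at each x."""
    return all(lower <= cdf(x) <= upper for x, lower, upper in band)


# Placeholder data: 36 evenly spaced scores (hypothetical, for illustration).
scores = [30 + 2 * k for k in range(36)]
d, band = confidence_band(scores)
print(round(d, 2))  # 0.2
print(lies_within(band, lambda x: normal_cdf(x, 65, 21)))
```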


CONCLUSION „,3„f hypotheses and 

The relative '"'[’^““e o ^anVmie" “^ti^atToVthr niiutni- 

ludes of differences in d‘«e Pt”',^ view of this chapter 

“ portanee for the^"'"f Jst. The „ population differeiiec, 
ascertain "-here differed , searclnnS f P^tiieally as well a. in 
has been that the '°eial .t„.dly open tn differences and 

must proceed “ max vrsugaf onha^ ^formation aliout 
his field of i"'"e^'„C„h3Uteehntff“«b ® p,t,.„ful estiination 
use of the Smirnov « ^ dismh“tion^ deiermlne .he diilnhulion 

*e shape of the P°Pn^a' ,^„,,h.ch 

procedures bas^on^t^J^p^yrms. „„t.e shoun lha. much of 

curve, can be used biolog*^' ” 

Inscstigations m 
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weU fitted by no™a«. It — 

::rrtr - be ab. . act oa 

our knowledge 
PART V


The Application of
Research Findings


In the field of social research, more than in many fields,
the research workers themselves are frequently concerned with the
application of their findings and with problems of social action.
There are two important differences between the social and the
natural sciences in this respect. The usefulness of natural-science
discoveries calls for little or no understanding of the principles
involved by the user once the engineers have put the application into
the form of a mechanical device. In social science, however, the
findings having to do with interpersonal relations cannot be used
without a real understanding of their meaning. The second difference is
that application in social science means changing human behavior,
and this is in itself part of the very subject matter of our field.

Moreover, the social scientist becomes involved in the applica. 
tions of his findings because he often maintains a close relationship 
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TVlmt base the scentisu found that -will h p g 

-The findings and .v.ng in and adapt 

'’"^"c“ap"-ve‘l interested primarily m ways in which 
social practitioners and all citizens can utilize the resources of
psychology to improve personal insight, policymaking, program
planning, and individual and group action. Other chapters of this
volume have dealt with the standards and criteria by which scientists
learn from one another.

The utilization of science will occur only if the person or group
somehow becomes ready to look for and to use scientific resources
in the solving of problems. This readiness seems to depend upon
three sources of motivation: (1) problem sensitivity, (2) an image
of potentiality, (3) a general experimental attitude toward
innovation.

Motivation for the use of scientific findings and methods
stems simply from the fact that the present state of affairs is
unsatisfactory for someone. Perhaps the chairman of the P.T.A. program
committee finds that attendance at meetings is showing a downward
trend; the business executive discovers that the productivity of his
plant is remaining steady or declining rather than showing
improvement; the government agency is under attack from Congress to
justify the way in which it has been spending funds; the solicitors in
the Community Chest Drive are not collecting as much money as they
have previously; a disruptive state of tension exists among ethnic
groups in the community; John Doe feels he is not getting a really
full life. This type of sensitivity to a problem is frequently a reason
why responsible leaders look outward for sources of help to get a
deeper understanding of their problem situation and to find new
principles and methods of functioning more effectively.

The image of potentiality is another very important source of
initiative. Perhaps from their own imagination, or from observations
of situations elsewhere, certain individuals have an idea that their
situation might be much more satisfying than it is now if it were
different in certain respects. Perhaps, as a result of talking with some
other officers, the P.T.A. president has become convinced that she
should not be satisfied with 50 percent attendance at meetings.
Perhaps certain clues have convinced the government administrator
that there is a great deal of wasted effort in what is generally
regarded as an important and effective program. Perhaps for some
reason the chairman of a community council has the idea that all
the competition that goes on between representatives of the various
community groups who sit on his committee is not necessary.
Perhaps the manager of a textile plant has read the reports of the
experiments by Coch and French (9) and is wondering whether it is
possible that the productivity of his plant, which is regarded as quite
good by industry standards, could really shoot up in the same way.
In each case, this image or tentative question about potentiality
stimulates need for further information about possibilities.

Some leaders and organizations have explicitly accepted the
standard that a continuous effort to "keep up with new discoveries"
and to "try out new ideas" is an imperative. For such persons and
groups, the utilization of science has become an important goal.
This state of affairs is, of course, not yet common in regard to the
social sciences, but a growing number of individuals and
organizations are explicitly establishing such a goal.

Even when there is motivation to turn to science for help, this is
just the beginning. Complex problems of research interpretation
and application must also be solved. We shall examine these
problems in two types of situation: (1) where there is a desire to apply
scientific knowledge discovered elsewhere to the solution of a
present problem, and (2) where there is a desire to apply research
procedures directly to help solve the present problem. We are making
this distinction because it seems important for the analysis of the
process of science utilization. In the first case there are questions
as to whether and to what extent the research done elsewhere
applies to the decisions and actions in question. Also, there are
questions of how the research from elsewhere gets communicated to the
relevant actors in such a way that its practical value can be
realistically assessed and acted on. In the second case, we have the
problems of whether the research is focused on major dimensions of the
problem rather than on symptoms; whether the data-collection activities
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have been accepted “"“I to other 

utilize them; and "Aether the rmearchjnd^^gj^^ 

Ta^^yL^tstaS -iew a number oi illustrative cases 

ot social-science utilization. 

USING KNOWLEDGE AND THEORY DERIVED
FROM RESEARCH ELSEWHERE

As a person or organization turns to the scientific field for
help on a problem, he is faced with a number of questions
about the applicability of research done in other settings. To what
extent and in what ways is his situation comparable to those in
which the research was done? Does a theoretical principle apply to
his situation also? Is the way of approaching the problem all that he
can learn from the previous work, or is there also something to be
learned concerning the substantive content of the findings?
Unfortunately, many persons do not review these questions when
looking for help from scientific resources. They may reject
scientific work and its implications because of differences
between the situations or populations upon which the research
was conducted and their own. Or they may uncritically accept
findings and insights as relevant in their own situation and proceed
unsuccessfully on this unrealistic assumption.

There is another significant question which must be considered:
How can scientific knowledge about "what causes what" provide
guidance in doing concrete thinking about what will happen?
Formulating plans for action on a scientific base often calls for
additional and different scientific information than the information
needed to understand why things are the way they are. The sections
which follow attempt to review some of the ways of thinking about these
questions of applicability of scientific resources and some of the ways
in which application can be facilitated.

The Need for Sufficient Theory 

As the potential user reviews a piece of research done elsewhere,
he may or may not discover that the scientist has done enough
generalizing from, or theorizing about, his findings to provide
helpful clues. It may sound somewhat paradoxical to state that one of
the ways in which scientists can give their findings concrete
significance for practitioners is to do adequate theorizing about the
findings. But this is the case. The spelling out of abstract
generalizations which emerge from the study of a specific situation
provides one of the most helpful means of relating these insights to
the analysis of other situations. The reports by Cartwright (6) on mass
persuasion and Coch and French (9) on resistance to change are good
examples of this. Cartwright's paper starts from the results of the
studies of the selling of war bonds during World War II, and the Coch
and French paper starts from a study of the human factors involved
in technological changeover in a textile plant. In both cases,
however, the authors moved to a level of theorizing about the phenomena
they have studied which makes it possible for a wide range of
practitioners to see how the generalizations apply to the analysis of
their problems. This is possible because the concepts used to organize
and interpret the data are concepts which are easily seen as relevant
and important in a wide variety of situations.

Studies of Widely Distributed Phenomena and Populations

Studies such as those on the authoritarian personality, on
autocratic and other kinds of leadership, on resistance to technological
change, or on interpersonal relations between supervisors and
workers have focused on aspects of behavior and social process which
are important features of a wide variety of social problems in many
types of social situations. Theoretical generalizations based on
research dealing with widespread phenomena are likely to have
relevance for a wide variety of practical applications.
Social-psychological studies which are focused on phenomena which occur
infrequently in operating problems and situations are apt to yield
generalizations of less widespread applicability. This does not mean
that these studies are less "basic" as contributions to the developing
science or to the solution of some specific problem.

There is another closely related way in which the scientist's
approach to his research helps facilitate the process of research
utilization. This is by the selection for study of situations and
populations which are widely distributed in society, such as industrial
Channels of Communication

But even if the research setting, the type of problem, and the
treatment of the data are advantageous to the providing of
relevant insights to a wide variety of planners, another
type of problem may nevertheless be present. This is the
actual communication of such relevant findings to the potential
consumers who need them. Effective communication must be established
between relevant social-scientific resources and the potential users
of these resources. One help in this direction is the work of the
social-science middleman, the science writer. A good science
interpreter is able to classify and synthesize research findings so that
they are more clearly related to the problems posed by operating
persons and are related to a wide variety of problems, so that many
practitioners can find the material of relevance by reading such an
overview and can learn where to find the sources of data. Examples
of this are the books by Watson (39), Murphy (35), and Marrow (33).
Such overviews may also be presented as specially invited papers by
scientists at the professional meetings of practitioners (25).
Overviews may frequently have the stimulus value of creating the image
of potentiality referred to above, or they may provide someone who
is sensitive to a problem with a direction in which to seek for help.
In the field of social-psychological research the implications
of the findings for what to do in a specific social situation are often
complex and likely to be tied up with fears and uncertainties about
the consequences of the findings. Therefore, more adequate and
intensive methods of communication are required to stimulate an
understanding and acceptance of research findings and their
implications than in the physical sciences. For this reason the
processes of demonstration and of reassurance by peers are often of
great importance. For example, a very important community project on
the solving of intergroup tensions sponsored an open-house workshop
for leaders from other communities to come and help review and
interpret the research findings and to observe the procedures being
used. Peer reassurance is illustrated here by the fact that the visitors
were able to talk about the feasibility and results of the project in
their own nontechnical language with peers in the experimental
community who were perceived as "the same kind of people we are."
This experience encouraged the visiting leaders to try applying some
of these principles in their own situations.

The staff of the Tavistock Institute of Human Relations has
described a "budding-off" conference in which representatives of a
factory where a major research project had been going on for some
time invited representative visitor teams from several other industries
to come to review what had been going on. Labor representatives
talked to labor members of the host plant, engineers to engineers,
and management to management as the first phase of the conference,
before the social scientists were called on to help analyze what had
been happening in the project. In many cases this type of
communication is necessary to provide the motivation and insight needed
for a budding off of findings and methods to new situations.

We have mentioned previously that there is an additional
problem of translating diagnostic insights about why things are the way
they are into well-formulated hypotheses about what to do about
it and how to test whether these alternative lines of action are
correct and workable in a specific situation. This leads us to one of
the most important processes of science utilization: the use of the
scientist as a consultant. In a vast majority of cases the effective
carrying through of a process of utilization of research findings into
integrated policy making, planning, and operations requires active
face-to-face interaction between a social scientist who serves as an
interpreter and consultant and the key operating people involved. Such
a scientific consultant is not primarily a producer of research. He is
more of a social engineer and has a multiple role to perform. On the
one hand he must become familiar enough with the operating problems
so that he can help reformulate them to make possible a more
scientific analysis of them. He must have a broad enough orientation
to social research and theory so that he can bring the relevant
research knowledge to bear on the analysis of the problem and the
prediction of probable consequences of various lines of action. He
must also be able to help set up procedures for measuring and
assessing the consequences of new lines of action. Perhaps most
important, such a scientist-consultant usually finds that he needs to
take the role of a trainer of his operating co-workers in "the
scientific attitude" or "scientific approach" to thinking about the new
operating problems which occur daily.


Illustrations of Research Utilization 

Now that we have reviewed briefly some of the problems and 
possibilities of motivation and communication involved in applying 
social-science knowledge from other settings to specific operating 
problems, let us look more closely at some examples of attempts to 
apply social science. In a later section we shall summarize what seem 
to us some of the important general principles that underlie these 
illustrations. 

The executive committee of a Community Chest asked a social 
psychologist to help them train their fund solicitors. After look- 
ing into the past practices of the solicitation procedure, the con- 
sultant recommended a number of modifications derived from 
research and theory. For example, he suggested that solicitors par- 
ticipate more actively in the process of setting quota objectives so 
that they would feel psychologically more committed to their objec- 
tives. This was an application from several research findings that 
persons who participate in a group decision are more likely to carry 
through the commitments of this decision than are persons who 
receive assignments or exhortation without participation. Also,
the plan included more careful consideration of matching the solici-
tor to his "targets” in terms of existing status and group relation- 
ships. Research on social influence has indicated the importance 
of prestige and reference-group membership in exerting influence. 
Another application of reference-group theory was a recommenda- 
tion to develop a clearer rationale for expected size of contribution 
in terms of specific subpopulations to which each giver could see
himself as belonging.

The executive committee rejected all such ideas with a reaction
that "we made our quota last year by the previous methods, so we'd
better use them again." Psychologically they were unable to (1)
accept the validity of the data from elsewhere as relevant; (2) accept
as realistic a goal of "doing better than we have before"; (3) see
diagnostically some of the differences between "last year" and "this
year" as a money-raising situation.

It is possible that all these hurdles might have been overcome 
had the consultant started out with the objective of creating a need 
for help, rather than assuming that this already existed. Or he might 
have been successful in suggesting a policy of "trying out" a modified 
approach with one part of the population. 

COMPLACENCY SHOCK. As we have indicated previously, it is
often difficult for the individual or the group to perceive and accept 
the fact that the operating situation needs improvement. The follow- 
ing case illustrates some of the elements involved in this type of 
situation where there is no strong problem sensitivity or image of 
potentiality: 

The Mutual Security Agency has invited to this country numer- 
ous productivity teams from the various industries of the Marshall 
Plan countries. These teams of twelve are delegated to make a study 
of productivity in several American plants in order to find ways of 
improving the productivity in their own industry. As might be 
expected, there is in many cases a strong tendency to see any higher 
productivity in the American plants as due to special advantages not 
possible abroad, such as superior equipment or raw materials. In 
order to facilitate comparative analysis of their own plants and the 
American plants, the members of the team from one industry were 
asked to make a special pretrip study of productivity in their own 
factories by going around as a team visiting the various factories 
from which they came. They made a careful record of productivity 
data and of the various manufacturing procedures in their plants. 
The first plant they visited in the United States had about three 
times greater productivity per worker than their own plants. It was 
quite easy, however, for them to point out a variety of factors of 
superior equipment which they felt could account for the differences. 
They were then taken to a small plant comparable in size and 
equipment to a typical plant in their own country. After looking 
at this situation, they agreed that it was comparable to their own 
and therefore that a comparison of the figures on productivity should 
be valid. The analysis of productivity again indicated that the pro-
ductivity per worker was two to three times as much as in their
own factories. In the face of this situation, their complacency was
shattered. They became eager to learn how to apply the discoveries
and innovations of the American plants in their own. This illustra-
tion seems to be typical of many situations in which an individual
or an organization has overtly expressed a desire to improve the
present situation but is unable to accept various ideas for improve-
ment as relevant for themselves until they can see that the improve-
ments have been successfully executed by someone “in their own
league.” When this factor is accepted, the data which have been
discovered elsewhere are accepted as relevant and applicable to the
analysis and evaluation of their own situation.

A RESEARCH APPLICATION CONFERENCE. Let us turn now to sev-
eral illustrations of attempts to communicate research findings in a
way that would clarify their relevance to social practice. In the first
example, a group of adult education practitioners decided they
would like to explore what help they could get in thinking about
their professional operating problems from interaction with a
selected group of social scientists. They organized a two-day confer-
ence which had the following design:

During the first phase of the conference, the practitioners sat
around the table taking a census of what they regarded as critical
professional problems, attempting to arrive at some consensus con-
cerning the most important problems as they saw them. They
attempted to formulate the nature of these problems and the
causes as they understood them. During this phase, the five or
six social scientists representing various disciplines, from anthro-
pology to psychology, had the job of listening and keeping notes
on their interpretations of the nature and the causes of the problems,
making cross references to any relevant social science research and
theorizing which might be helpful in shedding light on the various
problems which were being explored.

During the second phase of the conference, the social scientists
took the center of the stage to interact with one another. They
shared and integrated their observations about the crucial dimen-
sions of the operating problems which had been discussed, the way
in which these problems might be related to deeper underlying prob-
lems on which research had been done, and the kind of generaliza-
tions which might be tentatively advanced as guide lines for clarify-
ing the nature of the operating problems. A wide variety of researches
and research-derived generalizations were brought to bear and there
was considerable consensus in reformulating many of the operating
problems. During this phase, the practitioners were primarily listen-
ers, although they asked questions of clarification.

During the third phase of the conference the whole group
worked together on three activities: (a) redefining the nature of the
basic operating problems in adult education, (b) formulating some
general principles for the improvement of practice, and (c) identify-
ing certain areas of needed research which should be conducted in
adult education settings to test the value of certain theories devel-
oped in other settings or to open up new fields of knowledge which
had not been explored in other settings.

The participants were in general agreement that this type of
communication situation was very valuable and successful in clarify-
ing the relevance of social science research to various operating prob-
lems and in reformulating operating problems so that they could be
related to basic scientific findings and theory. Several problems of
making this kind of interaction a successful communication process
were identified. It was clear, for example, that the discussion co-
ordinator needed, if possible, to be sensitive to the frame of reference
of both the social scientists and the adult educators in order to
identify and help clarify points of noncommunication and to help
both groups find rewards in this type of interaction situation. It
seemed to be particularly important to help members of both groups
clarify their roles as listeners and as actors in various phases of the
communication process. The coordinator also had to help give the
conference a continuous movement toward applying generalizations
from research to operating situations and formulating the major
problems for further research.

A RESEARCH REVIEW CONFERENCE. Our second illustration of
research communication is a somewhat different type of research
utilization conference. This was a one-day meeting at which a hetero-
geneous group of fifty community social welfare leaders met to listen
to a review of research on leadership and group dynamics.

During the first half hour the conference leader discussed with
the community leaders ways in which they might plan to get as much
as possible out of the research which would be presented. It was
agreed that they would listen with an active interest in making notes
on points in the research review at which each of them got glimmer-
ings of possible relevance to situations or operating problems about
which they were concerned. At the end of the research review they
would convene in committees of six or seven to share their thoughts 
and observations in order to formulate (1) questions of clarification 
to the social scientist, and (2) tentative generalizations which they 
would draw from the research review concerning operating prob- 
lems. These they would test by getting the reaction of the research 
scientist to the validity of their generalizations and the possible 
extension of the generalizations. 

The research review, which took about an hour and a half, 
integrated a great variety of studies under several research topic 
headings such as Leadership, Communication, Decision-making, 
Participation, etc. Under each heading a variety of empirical results 
were presented and a number of theoretical generalizations were 
formulated. During the next hour, all the subcommittees held very 
active discussions, each committee having a recorder-reporter. The 
following period of interaction between the group reporters and 
the social scientists was also very active, although rather frustrating 
at times for the social scientist. The subcommittees had been very 
inventive and creative in applying and extending various research
generalizations, and he found himself in a position of challenging 
some of the applications which were being suggested. He felt that 
they needed considerable testing before they could be applied. He 
was active in suggesting boundaries to generalizations and dangers 
of oversimplifying the complexities of operating situations by think- 
ing only in terms of the interaction of two or three variables rather 
than of many additional ones which might be just as important and 
which had not been explored in the research which he had been
reporting. 

As a whole, the conference seemed to strike a balance between 
providing new insights into the analysis of operating problems and 
training in the “scientific attitude” of thinking about some of the
boundaries of generalizing from one situation and set of variables 
to another. The participants in the discussion indicated a high 
degree of satisfaction with the results. The research reviewer and the 
conference leader felt that there were a number of weaknesses which 
could be improved upon next time. They felt the whole problem 
of adequate research utilization had not been grasped because no 
single illustration had been carried all the way through to an 
examination of the problems of acting on new ideas and insights
about a problem. They also felt that the research reviewer had not
had enough opportunity to get insight into the complexities of some 
of the operating problems which were being talked about so that he 
could sense more thoroughly the relevance or lack of relevance of 
some of the research findings and theory which he had been report- 
ing. It was also observed that the participants in the discussion 
became quite motivated to read more social-psychological literature, 
but the research reviewer had not been adequately prepared to sug- 
gest available types of reading which might provide good follow-up. 

FOCUSING ON A SPECIFIC OPERATING PROBLEM. During World 
War II, one of the present writers had responsibility for a training
program in which it was necessary to bring the disciplines of anthro- 
pology, social psychology, psychiatry, economics, political science, 
journalism, and geography to bear on the solution of a specific type
of operating problem. To integrate this diverse spectrum of informa- 
tion into a focused set of operating knowledges, attitudes, and skills, 
it was necessary to train each group of operating personnel (about 
a dozen in each subgroup) to operate as a learning team in utilizing 
information from many different scientist experts. For example,
before a political science specialist arrived, the operating group 
worked together for a class period on the preparation of a group 
interview schedule of questions which would be covered and cross- 
checked in conversational style during the group discussion with the 
specialist. The findings of the group discussion would be summarized 
after the specialist left, and hypotheses would be formulated for 
further exploration and cross-checking with other scientists. 

In some cases the trainees needed to convert the information 
they acquired into specific interpersonal skills. For example, they
needed to get from the anthropologists the kind of information 
which would actually help them behave appropriately in cross-cul- 
tural contacts with representatives of a culture very different from 
their own. They found that descriptions of appropriate behavior 
acquired by questioning the anthropologists were not easy to trans- 
late into actual behavior. They found it necessary and effective, 
therefore, to set up role-playing to deal with specific cross-cultural 
contact situations, with the social scientist giving demonstrations 
and providing at-the-elbow supervision in appropriate behavior pat-
terns.

Another procedure for cross-disciplinary integration used with
these trainees was to give them a specific operational problem to
solve, in which their first job was to make decisions as to what kind
of information they needed and from what type of social scientists,
and to formulate a procedure for getting this information from such
scientific personnel. The scientists were then called in and the group
had to make good use of them and integrate the material into a set
of operating decisions and actions. It was the judgment of the three
leaders of this training program that the focus on the solution of
specific operating problems was an indispensable requirement for
the successful integration and use of the resources of the various
social sciences. One of the problems of such an approach was that
of “training” the social scientists to enter into this group interview
procedure in which they were asked to focus their thinking and
background knowledge on certain functional questions rather than
to present prepared lectures organizing the substantive content of
their specialty around more general or abstract topic headings. Most
of them proved quite ready to make the adjustment and were able
to contribute in a very fruitful way. 

PRODUCING AN APPLIED SOCIAL SCIENCE PERIODICAL. Another
example of the interpretation of the social sciences is illustrated
in the production of the monthly periodical Adult Leadership. The
professional association which has responsibility for the produc-
tion of this periodical has conceived its main function as communi-
cating to lay leaders principles and techniques for performing
their job of community and organizational leadership. It was felt
that this objective called for drawing on the resources of the social
sciences in order to produce valid content for such a periodical. On
the other hand, it was recognized that most social scientists would
not have the time or the skill to communicate clearly through the
type of writing demanded for such a popular periodical. Therefore,
an Operating Committee was established which was composed of
representatives from the social sciences, from training methodology,
from magazine publication and art layout, from public relations, and
from the operating field of leadership practice. This group was given
the major responsibility for thrashing out with the small periodical
staff the major policies and procedures for the content and produc-
tion of the periodical.

In the early stages of production fervor, a number of things 
happened which helped to put the microscope on certain problems
of social-science utilization. For example, after agreement was
reached concerning the over-all outline of content for a specific
issue of the periodical, some of the social-scientist members of the 
Committee undertook to prepare certain articles, and other members 
of the committee and staff with specialized writing skills undertook 
to produce other units, after a briefing in the committee meetings
about the main themes of the content. Tension became somewhat 
high when the writers told the social scientists that their productions
could not be accepted without rewriting and the social scientists
told the writers that their contributions oversimplified the basic
data which the article had been designed to interpret. Out of these 
first efforts have come a number of interesting production patterns. 
For example, certain members of the periodical staff have now 
acquired skills in briefing social scientists about what kind of con- 
tent is needed in rough draft form to provide the basis for a writing 
job by a professional writer, whose effort will be reviewed by one 
or more social scientists. Another successful pattern of production
has been to set up a briefing session during which one or more social
scientists think out loud about what the content and sequence 
of a particular article might be while the professional communica- 
tor takes notes, asks questions of clarification, checks whether he is 
getting the ideas by rehearsing them out loud, and then writes up
a first draft of the article for review. Out of this kind of interaction 
has grown a very real mutual respect on the part of both parties 
and a much deeper appreciation of the problems involved in an ade- 
quate process of transition from research findings and scientific 
theory to communications of functional significance to laymen who 
have operating jobs of leadership to perform. 

DIRECT CONSULTATION ON A SOLUTION OF AN OPERATING PROBLEM.

In one city, the school population had grown rapidly and there was
need for an expansion of school facilities. Arrangements had been
made for a local election in which voters would have an opportunity
to vote for or against new schools and also for or against a bond
issue to finance these schools.

Although the local Parent-Teacher Association was thoroughly
familiar with the reasons for the need for the new school and was
particularly anxious that the outcome of the vote be favorable, they
felt that the vote should represent the desire of the entire community.
Therefore, they decided to start a campaign to inform people about
the facts to be voted upon and to get out the vote. In planning
the campaign, the head of the local P.T.A. sought the advice of a 
social psychologist on what the P.T.A. should do in order to be cer- 
tain that the voters acted when the time came to vote. 

This social psychologist had had research experience during
the war on the problem of why people did or did not buy war bonds. 
He knew that the studies of the several war bond campaigns had 
shown that the mass media were useful in helping people to under- 
stand the reasons why it was important for them to buy bonds. But 
he also knew that a major finding of the studies of the campaigns 
was that people were likely to act by buying a bond only when they 
were personally asked to do so.

In advising the local P.T.A., he drew upon his knowledge of these 
and related results. He urged them to plan a campaign utilizing both 
the mass media and personal solicitation but with emphasis on the
latter. He suggested that they ask the local newspaper and radio sta- 
tions to carry as much information as possible on the facts of the 
situation and to repeat this information in somewhat different 
stories on different days so as to increase the probability that people 
would become aware of the election and the issues involved. But he 
emphasized, in talking with the P.T.A., that, if they wished to moti- 
vate people to participate in voting, it would be very important to 
have each household called upon by a volunteer from the P.T.A. to
encourage every eligible voter in the household to vote. Acting in a 
vigorous manner upon his advice, the P.T.A. organized an informa- 
tion program about the need for additional schools and the cost of 
these additions which was widely disseminated through the coopera- 
tion of the local mass media. Most important of all, however, the local 
P.T.A. organized a campaign in which every household through-
out the city was called upon. In these personal calls on individual
households, neighbors of the person called upon gave him facts 
about the situation and urged him to be sure to vote in the forth-
coming election. The effect of this campaign with its house-to-house
solicitation was a large vote in the election and one which was
overwhelmingly in favor of the additional schools and of the bond 
issue to finance them. 

AN IN-SERVICE SEMINAR. A final illustration of methods of re-
search interpretation is a combination of scientific discoveries from
other situations and the findings of research conducted within an
agency to help understand and apply the local results. A social-science
team was asked to undertake a program of research to evaluate the 
effectiveness of the foreign-assistance program of one of the govern- 
ment agencies. It became clear that in order to understand and use 
the type of results which would be coming from the study it would 
be essential that the program planning staff and key administrators 
of the agency understand some of the dynamics underlying attitude 
formation, group action, and resistance to change. Before the study 
was completed, therefore, the key members of the staff were given an 
opportunity on a volunteer basis to participate in a seminar on
“changing attitudes and behavior.”

The seminar was restricted to 15 participants who would com-
mit themselves to attend all the sessions in spite of the operating
pressures of their daily jobs. The one session a week was held several 
blocks away from the office building. The ground rules of the seminar 
were that research findings and general theoretical principles which
might be relevant to the operating problems of the various mem- 
bers of the group would be presented for discussion, but the 
discussion would not wander off into the specific discussion of any 
of the operating problems of the members of the group until the
final sessions of the seminar. The last half hour of each two-hour
period, however, was set aside for “brain storming.” In this period
each member was asked to free-associate any connections he had 
made or was able to make between the things that had been re- 
ported and discussed in the session and his own operating situation. 
The seminar leaders kept a record of these free-association connec- 
tions for later review and discussion when there was a specific focus 
on the review and analysis of operating problems of the agency. 
This review also provided a transition to looking at the research 
findings of the evaluation study of their own program. It was the 
feeling of the research team that these “background theory and 
research report sessions” provided the kind of sensitivity and per- 
spective which helped this group of operating persons to be able 
to come to grips effectively and objectively with the findings of the
research conducted in their own situation. 

THE ANALYSIS OF THE SITUATION. One of the most difficult and
important problems for the social scientist who is serving as a con-
sultant is that of getting an accurate picture quickly of just what
the operating problem is so that he may be able to select and
interpret relevant research results and theoretical generalizations
developed elsewhere. The conscientious research scientist is quite 
likely to take the point of view that this is an impossible task with- 
out an additional comprehensive research diagnosis based on meas- 
urement of the present situation. Frequently this is correct, and 
this will be the focus of the second section of the chapter. Fre- 
quently, however, this approach is not feasible and would require 
quite an impractical delay in the decision-making and planning 
demands. Therefore, the social scientist consultant is frequently 
faced with the necessity of “getting the picture” as quickly and as
objectively as possible. The following case illustrates two techniques 
of tackling this problem: 

The social scientist was asked to consult with the administration 
of a large hospital on any relevant knowledge there might be which 
would be helpful in reducing the intergroup tensions among the
nurses, doctors, and attendants, in such a way that there would be 
improved care of the patients. During one afternoon the consultant 
conducted three "problem census" group sessions. In one session 
he met with a group of medical personnel, in another with a group 
of nursing personnel, and in the third with a group of attendants. 
In each session he shared his problem of wanting to get valid data 
on the problems each subpopulation felt it had of cooperation 
with the other subpopulations, in the interests of serving the patient.
He selected three persons at random in each group, explaining that 
while he interviewed them in front of the rest of the group he 
would like to have all the other members jot down whether they 
agreed with or disagreed with the responses of the interviewee and 
what modifications and additions they would make in their own 
response. The data would be reported orally if there was time but 
would be handed in if there was not sufficient time.

The interviews were then conducted with a preplanned set of
questions and proceeded slowly enough so that the rest of the group 
could formulate the variations in their responses. Most of them were 
very active in doing this. The interviews and supplementary data 
indicated that most of the tensions and low morale stemmed from 
interaction situations which occurred in the daily routines of hos-
pital life. It seemed to the consultant from these reports that prob-
ably a good deal of the difficulty had to do with noncommunication
of expectations, nonparticipation in decision-making, or noncom-
munication of decisions which had been made.

To check further whether these were the most relevant phe- 
nomena, the consultant recruited several volunteers from each role 
to recreate some typical interaction episodes, using role-playing 
techniques. From his observations of these situations, the consultant 
felt fairly secure about some of the major variables which were 
involved and was able to summarize some of the relevant research 
from other social hierarchical situations, and to make some general 
interpretations about the problems involved. Also, he suggested 
from experience elsewhere that key personnel be actively involved 
in an actual research project in which they would participate in a 
more thorough diagnosis of the dynamics of their own problem 
situation. This step takes us into the next section of this chapter, 
on the utilization of research done within the operating situation. 
The main point which this brief example illustrates is that the 
consultant can use a variety of techniques to get at least crude data 
on the particular operating problem so that he can more appropri- 
ately mobilize the scientific resources of research and theorizing 
done elsewhere which may be relevant in stimulating insights about 
the present problem. 

Interpretative Review of Case Illustrations 

IMPORTANT VARIATIONS IN THE APPLICATION SITUATION. One of 
the important facts we can note from the variety of illustrations 
above is that the psychological readiness and ability of the partici- 
pants to use social science and social scientists differs greatly from 
situation to situation. We have noted that in some cases the oper- 
ating personnel have taken initiative to seek new knowledge and 
ways of applying it, whereas in other cases this sensitivity to poten-
tialities does not exist. Also, where this desire to seek new informa-
tion does exist, in some situations it is a general attitude of curiosity
about possibilities of application (such as in the research-application 
conference and the research-review conference) rather than a search 
for help on a specific problem (such as the school-bond campaign 
or the improvement of the morale in the hospital). The task of the 
social scientist is somewhat different when the practitioners ask him,
“Tell us in general what’s going on and what new things are being
discovered in your field,” from what it is when he is asked, “Do
you know of any knowledge that will help us in solving this prob-
lem?” We can note another difference in our case illustrations
between those situations in which the social scientist is primarily
an information-giver and situations in which he is an active con- 
sultant in guiding the process of research application and doing an 
active job of interpreting scientific method and the generalization 
of scientific findings from one situation to another. 

Our case illustrations emphasize the importance of this more
active role for the scientist in applying scientific knowledge and 
theory in an appropriate and effective manner. This does not mean 
that many important applications have not also emerged from a 
practitioner’s reading some book of social-science interpretation or 
hearing some research-interpretation speech. Such illustrations are 
not very likely to come to our attention. It is our hunch, however, 
that the complexities of social-science research utilization are so 
great that in most cases it requires the very active teamwork of
motivated practitioners and well-oriented social scientists to bring 
about intelligent application of available scientific knowledge and 
theory. 

THE PROCESS OF RESEARCH UTILIZATION. A review of the fore-
going case illustrations from a different point of view reveals a
number of things that rather typically need to happen if successful 
application of scientific research and theory is to result. Even
though the situations did differ greatly from one another, the fol- 
lowing common elements seem to be necessary or at least highly 
desirable: 

(1) There needed to be the motivation to seek and use scientific 
resources. In situations where this motivation did not exist, it needed 
to be stimulated by demonstrations of potentiality, by complacency 
shock, and other approaches. This seemed to be the first step before 
further progress could be made. 

(2) Then an active process of redefining and reformulating 
operating problems was required so that the relevance of scientific 
research done elsewhere could be perceived. In the research-appli- 
cation conference the team of social scientists did an active job of 
first listening to the practitioners’ statements of their problems and 
then attempting to break these problems down and reformulate them
for scientific analysis. In the case of the research-review conference
and of the administrative in-service seminar the practitioners took
the initiative in attempting to relate their operating problems, as 
they saw them, to various research problems which had been worked 
on. 

(3) In nearly all of the case illustrations, we saw that the social 
scientist, in order to try to be helpful, had to do an active job of 
getting oriented to the action problem as the practitioner saw it 
in order that he might do an intelligent job of selecting appropriate 
scientific resources for application to this particular situation. He 
had to be a skilled listener and interrogator, and in one case he had
the persons involved recreate some of the typical social interactions 
of the problem situation in order to get an appropriate orientation. 
In many cases of social-science application an even more thorough
diagnostic research job is necessary to get the facts about the oper-
ating situation. This type of research is discussed in the next section
of this chapter, where we deal with the utilization of research con- 
ducted within the setting. 

(4) It usually is helpful for the social scientist to communicate 
a general orientation or “way of looking at behavioral dynamics” 
as a framework within which to interpret specific data. This cer-
tainly suggests that some type of general education in the behav-
ioral sciences is needed as a background for effective application of
research knowledge to specific problems. 

(5) In addition to reformulating operating problems in terms of 
relevant scientific findings or variables, an active job of thinking
about generalization and applicability of research knowledge and
theory is required. This requires that the social scientist, as a con-
sultant, help the applier gain an understanding of the methodology
of research application by facing such questions as comparability of
populations, comparability of situational dynamics, extrapolation 
of theoretical generalizations to different situations, and experi- 
mental-mindedness in trying new solutions. 

(6) One other important aspect of research application has been 
mentioned. The scientific knowledge which is related to a given
operating situation or a problem may be of several types. For 
example, it may give insight into “why things are the way they are,” 
it may give perspective on “ways in which the situation could be
different,” or it may give information about “how we can go about
changing the situation.” In all cases, successful application requires
that the applier interpret, plan, and execute specific steps of action
in his own situation. This requires creative and realistic thinking
about “what would happen if.” Helping formulate correct hypoth-
eses and plans at this point is a challenging task. Even though the
directions for application may be well understood, it may be neces-
sary to acquire new technical or human relations skills to execute
the desired steps.


RESEARCH METHODS APPLIED TO PROBLEMS 
OF ORGANIZATIONS 

Social science research is being done increasingly in real life
settings and on problems of major social significance. This trend is
likely to continue as the methodologies and substantive findings of
the social sciences become more extensive and more able to shed
important light on complex problems. But if an organization is to
derive full benefit from any research done on its problems, experi-
ence suggests that the work must be organized, planned, and con-
ducted in accordance with certain patterns and principles. The rest
of this chapter is devoted to a consideration of these patterns and
principles. It is written primarily from the standpoint of a research
staff which conducts sample surveys in, or for, an operating organ-
ization, but the principles stated are believed to be generally


A Working Relationship

A COOPERATIVE ATMOSPHERE. An important problem in doing
research in an organization concerns the attitude of [...] the
organization and its personnel. Since [...] and values, a researcher
[...] conduct research in an organization whose values and [...]
own. In [...] circumstances [...] he act in a way that makes [...]
organization and of its [...] completely [...] with them. In
planning and conducting the research and in reporting the results
to the organization, he will obtain better cooperation when he
displays a sensitive appreciation of the values of the organization.

This does not mean that the researcher, himself, need feel compelled
to accept and be guided personally by the values of the
organization. He can maintain his own values and his personal
integrity, provided, of course, that they are not in direct conflict
with those of the organization. When the researcher's values and
those of the organization are in direct conflict, he usually is wise in
not conducting research for the organization.

The major purpose, from an organization's standpoint, for conducting
research is to improve its operation. But improvement always
requires change. This problem is not new. Organizations are
continuously undergoing some change. All that research does is to
increase the amount and magnitude of change. All the problems
involved in changing the activities of an organization, consequently,
are present when attempts are made to apply research findings.

It is common experience that orders by themselves are seldom
sufficient to produce effective change in an organization and its
functioning. Other procedures, including those which make some
use of participation, are usually required. The persons who need
to seek the participation or cooperation of the others are those
persons who possess information as to what changes might bring
improvement. When this information is based on research, it is the
researchers, consequently, who are primarily faced with the problem
of obtaining the participation and cooperation of the others
if the research results are to be applied successfully. Moreover,
applying new ideas requires not only a knowledge of the new idea
but also a full understanding of the present operation. The research
staff, consequently, faces the problem of obtaining participation not
only to facilitate cooperation in bringing about desirable changes
but also to be sure that the changes sought represent the best available
thinking based both upon past experience and current research
findings.

Much of the rest of this chapter will be devoted to a consideration
of how to obtain this kind of participation. Cooperation in
seeking and achieving change grows out of honest participation
with full recognition and appreciation of the important ideas that
the many kinds of people involved can contribute. Cooperation is
not created by manipulation, at least not for long.

AVOIDING RESISTANCE. When an organization is contemplating
having research undertaken for it, there may be some important
persons in influential positions who view the proposed research with
some reservations. It is well not to proceed with the research until
these criticisms and reservations have been examined fully and in
relation to the advantages and disadvantages of undertaking the
research. Their resistance will manifest itself sooner or later, and it
usually is better to have it out in the open and fully faced early in
the proceedings. Often, if this is done candidly and unemotionally,
these persons will become more and more involved in the research
and increasingly favorable in their attitude toward it. If resistance is
ignored or brushed aside, it is likely to result in efforts to stop
the research if difficulties are encountered, or it may result in attempts
to block applications of the research.

CREATING REALISTIC EXPECTATIONS. Just as some persons are
unduly skeptical as to the probable value of the results that research
will yield, others are unduly optimistic. This latter group tends to
have unrealistic expectations as to what research can do for them
or for the organization. This can result in serious difficulties, for
there are aspirations that even the best research cannot possibly
achieve. If people within an organization maintain these unrealistic
expectations, they are bound to be disappointed with the results
obtained from any study, no matter how good these results are.
Such disappointment may lead them to reject the idea of using
research in the future.


In order to avoid the disappointment which occurs when unrealistic
expectations exist, it is important to create expectations that are
[...] to the probable contributions from research. The [...] moderate
expectations is best done during the planning [...] of the research
project. This can be done while [...] studied and the probable
character of the results [...]. When expectations are modest, the
[...] results are likely to be greater than anticipated.
This encourages the further use of research.


Organization of Research Relationships


AN INTERNAL STAFF VS. AN OUTSIDE RESEARCH ORGANIZATION.
There is neither a simple nor a universal answer to the question of
whether research should be conducted by a unit in the organization
or by an outside research staff. The answer depends upon the problem
and upon the situation. Some observations can be made, however,
which may be useful in helping to decide which is better in a
specific situation. It may be well to make these observations by 
stating some questions and commenting upon them briefly. 

Is a research staff located within the organization or is a 
research staff from outside the organization more likely to: 

(1) be able to undertake and conduct the research with the 
greater objectivity? 

Usually an outside research staff, particularly if it is from 
a well-known research institution, is able to resist with 
greater success any pressures likely to result in a loss of 
objectivity. 

(2) be able to focus the research on the fundamental dimensions 
of the problems being considered? 

Often the officers of an organization are so close to the 
problem involved that they tend to see only the immedi- 
ate problems and often confuse symptoms and basic 
causal factors. The research is more likely to be produc- 
tive if it is focused on the causal factors rather than 
symptoms. An outside research staff, particularly if it 
comes from a well-known and respected research institu- 
tion, is often in a better position to focus the research 
on the causal factors rather than symptoms or immediate 
operating problems of a transitory nature. An inside re-
search staff of high prestige and power may, at times, be
equally successful in being able to focus the research on 
the basic problems. But it is difficult for an inside research 
staff to question and do research on the assumptions and 
underlying philosophy upon which its top management is 
operating. 

(3) have a better knowledge of the problem and of all its rami- 
fications? 

Although an inside research staff may at times be handi-
capped by not being able to tackle a problem in terms of
its more important dimensions, it nevertheless usually 
knows more than an outside research staff about the 
problems of an organization. An inside research staff
usually has an appreciable amount of information avail-
able about the problems of the organization, their history,
and some of the major developments with regard to them. 

(4) receive the full cooperation of all of the persons involved?

An inside organization usually has closer contacts with
the personnel whose cooperation will be important to the 
research. If an atmosphere of confidence and trust perme- 
ates the organization, an inside research staff is likely to 
obtain the better cooperation. On the other hand, if 
conflict and a substantial amount of fear and distrust are 
present, an outside organization with a reputation for 
objectivity and integrity is likely to be able to obtain the 
better cooperation. In general, any research operation 
which depends upon obtaining data from people’s re- 
sponses will obtain full, accurate data only when the 
people whose cooperation is required feel that they can 
trust the research staff. 

(5) have or be able to obtain the personnel required to do the 
research, including personnel with the specialized compe- 
tence that may be required? 

The desirability of creating within an organization a 
trained research staff including specialized technical per- 
sonnel will depend in part upon whether this is a one- 
time study or one of a continuous series of related studies. 
If the organization is continuously conducting a sub- 
stantial amount of such research, it is often advantageous 
for it to create its own research staff. If it is a one-time
study, or if the volume of research undertaken is limited, 
the organization often obtains better and more econom- 
ical research by using an outside research staff rather than 
creating one for the single study. 

(6) be able to exercise the influence required to have the results 
of the research effectively used? 

In some situations an inside research staff has the power 
and prestige to assure that the results are likely to be 
used. In other instances, an outside research staff with an 
excellent reputation is more likely to be able to exert the 
influence required. In deciding whether to use one, or 
the other, or both in a cooperative arrangement, consid-
eration can be given to the prestige and power required
and the amount each is judged to have. Another dimen- 
sion of the problem is probably at least as important as 
the question of an inside vs. an outside research staff. This 
question concerns the status of the person to whom the 
research staff reports. If he is a top officer, there is likely 
to be sufficient influence present to encourage serious 
consideration of the results. If he is a person of minor 
importance, the research findings are much less likely to 
receive consideration. 

SELF-SURVEYS. The self-survey (38, 40) is a procedure which has
merit where the problem is not too complex and where widespread
participation is advantageous. With this procedure the organization
itself conducts the study with appropriate technical advice and as-
sistance. Usually a manual is available to tell how, on a step-by-step
basis, the self-survey is to be conducted. Most organizations also
obtain the assistance of a competent social scientist to serve as a
consultant during the period of the survey. This consultant answers
questions and provides advice on how to conduct the survey. His
services include such tasks as helping to train personnel to do the
tasks required, advising on analysis plans, and helping in the inter-
pretation of the results. In addition to facilitating participation, the
self-survey usually has the advantage of costing less than a study
done by a research staff, or at least of involving less cash outlay. It
also provides a means of conducting a study when the shortage of
trained researchers would make it impossible to do any other kind
of study.

The self-survey, however, can be used only on a limited variety
of problems. They must be problems for which it is possible to
develop a reasonably simple self-survey procedure which will,
nevertheless, yield results of satisfactory accuracy. A continuous danger
in using self-surveys is that the utilization of unskilled persons will
yield results containing serious errors. In situations in which this is
especially likely to occur or in which the consequences of such errors
would be particularly disastrous, it would be unwise to use the
self-survey.

HIERARCHICAL LEVEL OF THE RESEARCH STAFF. It has been observed
that when a research staff is under the direct supervision of the
person whose operation is being affected by the research, the
research is often discontinued. To function successfully, a research
staff must report organizationally to the person who is superior to 
the man whose operation is affected by the research results. This 
principle seems equally applicable to both the physical and the 
social sciences, and for the same reason. As the results of research 
lead to improvements, the head of the operation most affected may 
feel threatened. Insofar as he feels threatened, he is apt to wish to 
have the research discontinued or the results ignored. When this 
occurs, he may discontinue it or bury the research findings if he 
has the authority to do so. Consequently, the research staff in an 
organization should neither report to nor be directly responsible 
to the person whose operation is directly affected by the research but 
should report to the next higher echelon. 

There is another important advantage in having the research 
staff report to the next higher echelon. When the research staff 
reports to the man whose operation is affected by it, he may limit 
the scope of the problems studied or he may order it to produce
findings to prove him right in what he is doing. Such restrictions 
make it impossible to conduct research effectively. 

Although the research staff should report to a sufficiently high 
echelon in an organization to protect its stability and integrity, 
all of its activities need not be conducted through channels via this 
level. Usually it is well to use the formal channels in agreeing on 
the nature and scope of a research project, but after that is done 
there is a distinct advantage in the research staff's establishing direct
and close contact and communication with all the echelons and
groups involved in the research. Even in using the formal channels
in agreeing on the nature of the research project, authority should 
not be used to force the research on the units involved. It usually is 
unwise to undertake research for or in a unit which does not gen- 
uinely want it. When people are forced to cooperate, they are likely 
to give distorted information. 

Problems in Research Design 

BASIC RATHER THAN SUPERFICIAL VARIABLES. It is not at all uncommon
for an organization to request that research be done on some
problem about which it currently is very much exercised and which
it feels important. Often, however, when this problem is examined,
it proves not to be the best problem upon which to do research. It
may be a symptom rather than the underlying problem. It may be 
only one part of a fundamental problem. Or it may not be stated 
in dimensions which permit systematic, quantitative research. 

The first task of a research staff is to diagnose the situation and
to prepare a clear statement of objectives for the research. Discussing
the problem fully with the officers and staff of the organization
facilitates this diagnosis. Attempting to state the problem in dimen-
sions based upon the best available theoretical conceptualizations
also helps.

The final statement of the problem and of the objectives of the
research must be acceptable to the organization. Often the discus-
sions involved in diagnosing the problem lead to a recognition and
acceptance of the problem as stated in the research objectives. Some-
times, however, this does not occur. If further discussions do not
lead to an acceptance of the problem as stated, the research staff
may have to start on a pilot or small-scale study devoted to a
peripheral problem but one which the organization recognizes and
about which it is much concerned. From the results obtained in this
pilot study, the research staff usually can demonstrate the nature
of the basic problem upon which the major research should be
concentrated. Often the research staff itself gains a clearer under-
standing of the basic problem as a result of the pilot study.

RELATIONSHIP BETWEEN THEORETICAL AND APPLIED OBJECTIVES.
It is impossible to emphasize sufficiently that research on
the operating problems of an agency need not and, if well done,
will not be concerned with the symptoms of problems or with
minutiae. Nor will it be concerned solely with finding specific
answers to specific problems. Evidence is accumulating pointing
to the advantage of designing research dealing with a specific
operating problem in such a way that the results can be generalized
and applied to other related situations.

If the scientist doing research for an agency seeks only
specific answers to specific problems, he is likely to run into
difficulties. One difficulty, for example, is that [...]
specific problem, [...]
that the cost of the research is likely to exceed the value of the
specific answers. But particularly important is that by the
time the research has provided a specific answer to a particular
problem, the situation will have so changed that the original problem
is no longer the problem. New ones have replaced it (11). By
inference, this indicates that research designed to meet the long-range
problems of an organization will be more valuable and have
greater application than “fire-fighting” research designed to meet
immediate problems.

The great value of generalized knowledge for dealing with
specific problems was well stated in a remark attributed to Kurt
Lewin: “Nothing is so practical as a good theory.” In the design of
research focused on major variables, the probability of significant
findings is increased if the best available theory is used as a guide
as to what to measure and what relationships to test. The better the
theory used in guiding the research design, the greater is the
probability of finding marked and important relationships. Obviously,
the more that the research discovers about those major variables
which have a marked relationship to the problem being studied,
the greater is the contribution of the research to solving this and
related problems.


Generalizations or statements of principles which summarize
the marked, important relationships discovered in the research have
two valuable uses. They serve as guides to help solve problems like
the one upon which the research was focused. They also make a
contribution to available scientific knowledge and the development
of theory. Cartwright's “Some Principles of Mass Persuasion” is a
good illustration (6, 7).

[...] changes in the character of the [...] to occur between the
beginning of the research [...] results. It is necessary for the
researcher to take [...] account in designing his research. In
addition to concentrating [...] research on the major variables
involved, he often will find [...] design his research so that the
results will adequately [...] reasonable range of change that may
occur [...] during the time required for the research. One way
of doing this is to design the research so that it will yield results
[...] analytically with two or more widely differing [...]. If [...]
assumptions involve situations more extreme
than any that is likely to occur, then the actual situation at the end
of the study will fall between the extreme situations assumed. By
bracketing his problem in this manner and having adequate data
to deal with a range of situations, the researcher usually is able to
make valid and useful derivations from his findings to the situation
that exists when the research findings become available.
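
The bracketing idea in the preceding paragraph can be sketched in a
few lines of Python. This is only an illustration under stated
assumptions: the function name bracketed_estimate, the numeric
"condition" scale, the figures used, and the choice of linear
interpolation between the two extreme cases are hypothetical and are
not part of the procedure the chapter describes.

```python
def bracketed_estimate(low_condition, high_condition,
                       result_at_low, result_at_high,
                       actual_condition):
    """Estimate a result for the actual situation from two extreme cases.

    Assumes (hypothetically) that the result varies linearly between the
    two extreme conditions for which the study data were analyzed.
    """
    if low_condition >= high_condition:
        raise ValueError("low_condition must be below high_condition")
    if not low_condition <= actual_condition <= high_condition:
        # The bracket failed: the real situation is more extreme than
        # either assumption, so the study's data do not cover it.
        raise ValueError("actual condition falls outside the bracket")
    span = high_condition - low_condition
    fraction = (actual_condition - low_condition) / span
    return result_at_low + fraction * (result_at_high - result_at_low)


# Hypothetical figures: the data were analyzed for a work force of 500
# and of 1,500; when the findings are ready the work force is 1,000.
estimate = bracketed_estimate(500, 1500, 40.0, 60.0, 1000)
```

The check that raises an error is the part that corresponds to the
text: the derivation is usable only when the actual situation falls
between the two extremes that were assumed in the design.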

RESEARCH ON PRINCIPLES AND PROCESSES. In organizational research
(see Case A, pp. 620-630), it is well for the research staff to
emphasize and re-emphasize that the objective of the research is to
discover the relative effectiveness of different methods and principles
and that the study is in no way an attempt to perform a policing
function. The emphasis must be on discovering what principles and
methods work best and why, and not on finding and reporting which
individuals are doing their jobs well or poorly.

Unless these objectives are made clear to all and rigorously 
adhered to, it will not be possible for the research staff to obtain 
the full cooperation that it needs from the people in the organiza- 
tion being studied. It is important for the research staff to make 
clear to every person that the interviews, questionnaires, and other 
data obtained from each person will be kept strictly confidential. 
People need to know that these materials are being collected for 
purposes of statistical analysis and that no one will be able to tell
which specific answers were given by which individual. 

The commitment of confidentiality must be clearly given and 
strictly adhered to. This at times may require the research staff not 
to report separately the data for very small groups, since to do so 
might reveal the attitude and answers of the individual members 
of the small group. 
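
The rule of not reporting very small groups separately can be
sketched as follows. This is a minimal illustration, not a procedure
given in the chapter: the function name report_group_means, the
example groups and scores, and the threshold of five respondents are
all hypothetical assumptions.

```python
from statistics import mean


def report_group_means(responses, min_group_size=5):
    """Return the mean score per group, withholding very small groups.

    responses: iterable of (group_name, score) pairs.
    min_group_size: hypothetical threshold below which a group's
    figures are not reported separately.
    """
    scores_by_group = {}
    for group_name, score in responses:
        scores_by_group.setdefault(group_name, []).append(score)

    report = {}
    for group_name, scores in scores_by_group.items():
        if len(scores) < min_group_size:
            # Too few members: a separate figure could reveal how
            # individual members answered, so it is withheld.
            report[group_name] = None
        else:
            report[group_name] = mean(scores)
    return report


# Hypothetical data: five respondents in one group, two in another.
responses = (
    [("assembly", s) for s in (1, 2, 3, 4, 5)]
    + [("tool room", s) for s in (2, 5)]
)
report = report_group_means(responses)
```

In this sketch the two-person group is left out of the separate
report; its answers could still be included in totals for the
organization as a whole.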

An orientation focused on discovering better principles and
methods of organization and leadership reassures persons who may
feel threatened by the research. If they feel that the research is to
learn how to help them to do their job more successfully, they
usually are eager to cooperate. This cooperation usually increases
as they see the research results used for this purpose rather than to
discharge or demote those whose work at present is not successful.


Ensuring Use of Research Results

INDUCING COOPERATIVE RATHER THAN [...]. Measurements of any
commercial, industrial, or [...] organization almost always show
that some things are being done well and other things are not being
done so well. In examining these research results, the officers of an
organization can take primarily either a constructive or a defensive
attitude toward the data. Fortunately, most officers take a
constructive point of view. Occasionally, however, a company officer
or supervisor takes a defensive attitude and
immediately becomes fearful when data are obtained which show 
that his operation is not now functioning in the best possible 
manner. His impulse, as soon as he has seen such research results, 
is to lock them up immediately so that no one else can discover 
that the operation is functioning imperfectly. Most company officers 
or supervisors take the opposite point of view when looking at 
similar data. Their reaction is to look at the results which present 
a favorable picture with pleasure but to look at them hastily. They 
then turn with genuine enthusiasm to the results which indicate 
where and in what manner the operation can be improved. They 
immediately share this information with their colleagues, subor- 
dinates, and all other relevant personnel in order that the necessary 
steps can be taken which will lead to further improvement in the 
company's performance. 

As we shall see, there is much that both the officers of an organ- 
ization and the research staff can do to prepare and assist the per- 
sonnel of the organization to take a constructive rather than a 
defensive attitude toward research results. 

PARTICIPATION IN PLANNING AND IN INTERPRETATION. If people
are unfamiliar with a research project and know little about it, they
are not likely to understand the findings or be interested in applying
them. Personal involvement not only decreases the barriers
to the use of data, it increases the probability that the results will 
be understood and accepted. Particularly important, however, it 
yields positive motivation to apply the results. This involvement 
should include all those who can influence the application of the 
results and should begin at the very outset of the project and 
increase as the project reaches the analysis stages. To wait until 
research results are available before attempting to obtain participa- 
tion represents a failure really to use participation and is likely to 
lead to the full or partial rejection of the results. 

The effectiveness of participation and involvement depends 
upon the rate or timing of the efforts devoted to this purpose. 
There seems to be no substitute for taking adequate time at many 
points in the process. The first point occurs when an organization
is considering whether to have research done on a problem it faces.
If high-pressure selling is applied, resistance is likely to occur. On 
the other hand, if the problem and needs of the organization are 
examined carefully and consideration is given to the help that 
research can and cannot provide, without pressure for a decision, 
the officers of an organization usually are more likely to understand 
and accept the assistance that research can probably provide. Also, 
the research staff is more likely to understand the problem and be 
more able to design an efficient study than if the decision to pro- 
ceed with the study is made hastily. When an organization has 
decided after careful consideration that it will benefit from having 
research done on its problems and begins to press the research staff 
to have the research done, its officers are much more likely to be 
sufficiently interested in the study to take the time and energy required
to become fully involved in the research.

Obtaining the participation of the relevant personnel in the 
planning stages of a study yields two dividends. It enriches and 
improves the material used in planning the study, and it also 
achieves the desired involvement. A similar gain occurs in using 
participation in the analysis and interpretation phases of the re- 
search project. The knowledge of company operations possessed by 
company officials and employees makes them experts whose help is 
needed by a research staff in planning a study and interpreting the 
data.


PRESENTATION OF PRELIMINARY FINDINGS AND ACCEPTANCE OF
RESEARCH RESULTS. The involvement and interest of the officers of
an organization tend to wane if the research staff waits until the
completion of the analysis before presenting any results to them.
Moreover, people usually can do a better job of interpreting
results if they are given time to assimilate gradually the major findings
emerging in the research. If nothing is reported to an organization
until the final analysis is presented, the officers are confronted
with a body of data which often includes some results which surprise
them. The research staff has then faced them with
a “take it or leave it” situation, and neither alternative is desirable
from the standpoint of the researchers. On the other hand,
when the research staff presents to the officers of an organization a
preliminary inkling of the probable results but presents it
as highly tentative, the officers are not compelled immediately to
accept or reject the data. Also, as further data are reported to them
and progressively build a clearer and clearer picture of the results,
the officers follow it with interest. During this period, they can test
the validity of the results by using other evidence or clues. This
testing and discovering that the results are valid facilitates their
acceptance.

People find it difficult to make a major change in their thinking
rapidly. It seems to require time for each of us to test new ideas and
new results and gradually to discover their validity and to accept
them. Not until then are we willing to build decisions upon these
new findings. There seems to be no substitute for time in this process.
Whenever pressure is exerted to achieve changes in points of
view or in thinking in unduly short periods of time, there is likely
to be strong emotional resistance to it.

SELF-ANALYSIS TECHNIQUES AND THE USE OF RESULTS.¹ Participation
in the form of self-analysis is more likely to be followed by
changes than if the analysis is made by someone else. This point
and the specific procedures used to implement it are not new. Clinical
psychologists are well aware of the importance of self-analysis
for bringing about change (56). Some of the procedures which appear
to be particularly effective follow:

(1) One important dimension of self-analysis as used, for example,
in Case A (pp. 620-630) is that it starts from and centers around
objective measurements. This tends to keep the discussion in an
objective atmosphere. There are no statements, reports, or recommendations
by outside experts to which an individual or a group
can take exception; there are only unemotional, objective data.

(2) In the series of meetings described in Case A, no expert tells
the group what the data mean or what its problems are. The interpretations
are worked out by the members of the group themselves,
with the representative from the research organization there to
answer technical questions which the group leader cannot answer.
Prior to the meeting, members of the research staff always study
the data that are to be discussed. At times in the meetings the
research staff members may ask questions about the data to help
focus attention on what appear to them to be important points that
are being overlooked. By taking this role, the research representative
avoids making recommendations that may be [...]
of the entire situation. He also sidesteps many of the individual
and group protective mechanisms which are set into action in
any real evaluation of the self or the organization in which the
self is deeply involved. The use of questions to help focus attention
on the important problems and on the solutions [...]
data produce solutions from the group and assure their [...].

1 The following three sections have been taken from Floyd C. Mann and
Rensis Likert, “The Need for Research on Communicating Research Results,”
Human Organization, 1952, 11 (4), 15-19.

(3) Resistances have to be recognized and worked [...]
glossed over. As might be expected, survey [...]
quite different from what the group expects. [...] occurs, resistances
in different forms and degrees of overtness [...]
and dealt with before the group can proceed again [...]
objective. If the handling of resistances is postponed, [...]
surface later when it may be much more difficult to deal with them
adequately or when it is too late to handle them at all. [...]
and are not dealt with effectively, they are apt to block
constructive action.

(4) Timing and pacing are [...] the acceptance
of the data and gaining recognition [...] need to act upon
them. In those situations in which the survey [...] different
from what is expected, it is necessary to [...] tempo,
preferably letting the [...].
This means letting the group pace itself in the speed with which it
considers the different aspects of the findings and determining
the depth to which the analysis and interpretation of [...]
will go in any one meeting. These two [...]
they tend to reduce the number of times [...] because
the group members are not yet prepared to understand or
ready to accept certain findings as facts.

(5) ft dwell to present wbad, ding done'well 

emphasizing first the results stbich slio ..hirh can be im 

In prcHniing results dealing suth operation 
proJed It is^cll to orient the ■I-™”-" '“r^Xmidduon of 
suggest arc ways to male the ,„„on on wcalncsKa 

how to improve enhsis inlcrcsl. whereas co , 

or failures produces anxiety and an „hidi is an 

(d) Arhilrarv insistence that the daia are acrnraie 
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implicit demand that the others make an immediate redefinition ot 
the situation, serves only to increase the emotional resistance and 
the amount of time ultimately required to get the findings accepted 
and used. It is best to give the individual or group ways to save face
(letting them explore all of the different possible meanings which the
findings might be assumed to have) before going ahead. One of the more
important things that an outsider can effectively do here is to provide
the individual with the motivation to re-examine his psychological
field to see if there are not even better interpretations to the
perceptual clues he has been getting and piecing together than the 
pattern which heretofore has satisfied him (32a). 

(7) The results should be in simple, nontechnical language. 
Shop language and graphical presentation should be used as much 
as possible. This facilitates self-analysis by making the group realize 
that the data deal with their situation and are not something be- 
longing to the research organization. 

ANALYZING DATA IN GROUPS. The procedure described in Case A
(pp. 620-630) involves working with groups rather than with
individuals. Lewin (19, 20) and his students (10) have emphasized the
power of the interacting forces exerted by group members on one 
another. Mann (31) finds that this is particularly applicable to a 
group's discussion of data dealing with its own work situation.
Participation in group discussions and group decisions concerning 
future action sets into motion pressures for action which are more 
effective than when individuals alone are concerned. By working 
with and through groups, use is made of these continuing group 
forces. 

The group situation seems to be important for several reasons: 

(1) Through group discussions, the findings can be examined 
in a broader perspective because the group brings to the data experi- 
ence that is richer and more varied than that of any one indi- 
vidual. 

(2) Group discussions, by allowing the pooling and exchange of 
this wider range of information, also provide the psychological situ- 
ation in which superiors and subordinates at all levels can discuss 
possible solutions and thus give one another new and improved 
ways of viewing and solving their problems. 

(3) The discussion, by groups, of the research data compels all
members of the group to recognize openly the existence of the
major problems revealed by the data. Important and [...] problems
which may have long been festering are brought [into the open in an]
atmosphere which leads to constructive attempts to [...].

(4) Group discussions also help supervisors [...] what is expected
of them by the group concerning their relations with subordinates,
associates, and their own [superiors].

(5) Group decisions concerning the next steps put powerful
pressures in the form of reciprocal expec[tations ...] carry out the
decisions agreed to by the group.

USE OF HIERARCHICAL SOURCES OF [POWER]. [For] effective
application of research findings, it is also essential to recognize
the hierarchical structure in an organization [and to] utilize the
power structures as perceived [by the members] of the organization.
Any series of meetings to achieve [...] should follow a sequence
which recognizes the [...] forces. Research data presented to the
groups [...] should show how different groups in the organization
[perceive] the power roles of other persons in the line and staff
groups. [The] people at the top of each organizational [unit are]
perceived as competent and powerful [and] exercise appreciably more
influence on the organization than any other persons within it.

One reason why it is so important to secure the interest and
full support of top officials [in planning] research and interpreting
the results arises from the peculiar problems involved in applying
the results of [social science research]. The president of an
automobile company does not need to understand the research
involved in developing a new automobile engine when he approves
plans for putting it into production. [All he] needs to know is that
it will perform better and require [...] present engine.
The problem of applying [social science] research results, however,
is not so simple. Effective application depends upon full
understanding and use of the research by the top management of the
organization. Consequently, it is [essential that top management be]
fully informed of the progress [of the] research and fully involved in
the application of the research findings.

Even in physical science and [engineering] research, it helps to
have top management interested in the research and identified with
it. This interest encourages [...] those who must apply the results
to follow the research closely, and it reduces
the lag between the development of new knowledge and its actual
use.

USING DATA SO THAT IT PRESSES FOR ACTION. Business executives
and government officials who have made extensive use of social
science research often point to the fact that they use the research
results in such a way as "to make the data do the work of pressing 
for action." 


The Director of the War Finance Division during the last 
war used research results very effectively to press for action. One 
illustration will serve to show how he did this. He knew, for example,
from his experience as Chairman of the X State War Bond
Committee, that personal solicitation was essential if substantial sales 
of bonds were to be made to large numbers of individuals. He discov- 
ered, also, after he became Director of the War Finance Division, that 
the chairmen of many other states did not accept his experience as a 
guide to what they ought to do. They shuddered at the thought 
of having to recruit and train tens of thousands of volunteers to 
serve as solicitors in war bond campaigns. Consequently, when he 
urged them to use solicitation and cited his own experience, they 
would solemnly assure him that their state was different from 
his state and that personal solicitation was not necessary to sell
bonds in their state. The result was an impasse and, at first, several
states did not use personal solicitation.


A study of the effectiveness of the Second War Bond Drive 
provided the director with data such as that shown in Figure 1. He 
had a brief pamphlet prepared which showed these and related 
results. He distributed these pamphlets to all the state and county
war bond committees. He also had these results presented in the
regional meetings in which the plans for the Third War Bond Drive
were discussed and developed. The net effect of using the data in
[this way was] that war bond committees convinced themselves of the
value of solicitation. They recruited and trained a much larger
group of solicitors. As a result, the number of people who were
personally asked to buy war bonds increased from 25 percent in
the Second Drive to 50 percent in the Third, and the sales of Series
E Bonds almost doubled. 


The impartiality of accurate measurements usually facilitates 
objective consideration of the facts, and this leads to the acceptance 
and implementation of effective policies. When decisions are made
by pitting one person's experience [against that of] another, there
is usually disagreement. [Often a] decision is not made, and any
decision that is made [is] followed only half-heartedly by many of
the [persons involved].

[Figure 1. "Of all gainfully employed in the country, how many were
PERSONALLY ASKED to buy in [the] Second [...]." Legible residue of
the chart: "88%" and the legend "Didn't buy extra bonds." Chart not
reproduced.]

APPRAISING USE OF RESEARCH RESULTS [BY] REMEASUREMENT. One
characteristically human tendency is to assume [that] whenever a
change is made it results in an [improvement]. Unfortunately, this
is not always the case. There is only one way to [learn] whether a
change has resulted in an [improvement, and that is to measure]
the effect of the change. It is not uncommon to find that the [...]
application of research results has [...] [improve]ment. It may take
two or three [...] change to [...] improvement of any magnitude.

A continuing problem which tends to influence the way [in]
which research results are applied [...] controlled experimentation
[...] organization. [A] relatively common point of view is [that if
a particular change] ought to be good, why test it experimentally
on only [part of] the organization? Those who hold this [view urge
that the]
change be introduced generally, where applicable, so that all units
will benefit and the organization derive maximum results from the 
change. A demonstration of the value of testing changes experi- 
mentally is often all that is required to convince an organization 
that usually it will derive greater benefits and derive them more 
rapidly if it tests the proposed changes experimentally before intro- 
ducing them generally. 


SOME ILLUSTRATIVE CASES 

In considering how to organize and conduct research devoted
to the problems of a specific organization, it is important to differen- 
tiate two broadly different kinds of situations. This is necessary 
since this difference affects the way in which the research should 
be organized and conducted if effective use of the results is to be 
obtained. One general type of situation is that in which the research 
deals with problems involved with the organization's internal opera- 
tions. Case A below is of this type. In situations of this type, the 
research operation is likely to have an appreciable impact on the
organization and how it operates. Consequently, it is extremely
important to have the full support and interest of the very top
officers of the organization and to have their full participation in the
plans for the research and its application. Unless top management
understands, seeks, and fully supports the changes being undertaken, 
and understands the character of the resistance to the changes and 
the reasons for this resistance, the attempts to apply the research 
will encounter real difficulty and may even fail. 

The second general type of research situation is that in which 
the research deals with problems faced by the organization which
are outside of the organization. Studies concerned with the buying
behavior of consumers or with citizen response to governmental
activities illustrate this general kind of problem. Case B below is an 
example of this kind of research. 


Case A (A Synthetic Case)

The Director of Industrial Relations in Company A approached
a research organization one day with the statement that the company
had no immediate, pressing problems, but that it had some long-range
problems which he felt would be [helped] by research. After
a lengthy discussion of some of [these problems and] consideration
of what kind of research design [would be] likely to yield useful
results, the company director and the [head of the] research
organization agreed to give the matter further [thought and to] have
another meeting involving other persons from [...].

Further discussions of the [value of] research for helping to solve
the long-range [problems led] the Director of Industrial Relations
to recommend to the President that Company A [proceed] with the
proposed research. This was followed by a meeting of [the] head of
the research organization and [members] of the research staff with
the President, Executive Vice-president, and the Director of
Industrial Relations of Company A. [In this dis]cussion, the
potential values of the research project were [considered at]
length; some discussion took place on how the [results] might be
applied [and the] problems likely to be encountered in [their]
application. On [the basis] of this meeting and the recommendation
previously made, the President authorized the study to be [made].
The research organization, before agreeing to proceed with the
[study], held a similar meeting with the President and [executive]
committee of the union local. The response and assurance [of]
cooperation obtained in [this] meeting and the genuine interest in
[the] research on the part [of] top management encouraged the
research organization to [agree] to undertake the proposed study.

COMMENT: Many [companies seek] research with long-range objectives.
[Others do so because] of more immediate problems. [...] turnover
or [...] stealing, and low productivity are examples of [the]
problems which may lead a [company to seek the] assistance [of a]
research staff. Often these more [immediate] problems are symptoms
of fundamental problems [and need] to be so treated in research
design and in applying research results.

The first step in undertaking the research was [to hold] a series
of planning sessions. The purposes [of these sessions were]:

1. To obtain as much [information as possible from] company
officers and personnel as to the [nature of the] problems. This
information was needed to be sure that the research
design included all the major hypotheses that should be
encompassed in the study.

2. To inform all interested persons and groups about the study
in a way that would enable them to ask questions about the study,
to express any fears or reservations they might have, to discuss these
fears, and to obtain all the information they wanted or needed
in order to have sufficient confidence in the study to cooperate
fully with it.

3. To plan and conduct the study in such a way that all persons
whose action was required to implement the study were fully
informed about it, were interested in it, and had an adequate
opportunity to ask that the study obtain data they needed to make the
best possible decisions on the problems falling within their areas
of responsibility. Taking sufficient time to secure this involvement
was necessary for the results of the study to be implemented.

COMMENT: It is important to give all the people who are
affected appreciably by a research project an opportunity to learn
about it fully, to ask questions, and to make suggestions with regard
to it. Full access to information about the study and as much
involvement and participation as is feasible reduces fears of the
research and increases the likelihood that the results will be used
constructively.

The planning sessions were not compressed into too short a
period of time and were arranged in such a way that those persons
whose actions would be required to implement the study had an
opportunity to participate and to raise the questions that were on
their minds. One of the important functions of these planning
sessions was to stimulate thinking to consider possible courses of
action which might improve present and future operations. This
[helped to define the] objectives for the research by indicating the
data required to help in a choice among possible alternative courses
of action.

Another important function which these planning sessions were
designed to perform was to help managerial and supervisory personnel
[become] better aware of the character and magnitude of some
of the larger problems, especially human problems, which they faced.
Often people are so immersed in day-to-day operations that they
are not aware of some of their larger and more important problems.

As the plans for the study proceeded and the actual collection of
the data got under way, [...] the company's personnel, whose
involvement [was] important, informed of developments. This was
done [...] to them as well as through the [...] house organ. This
sense of involvement was maintained and even increased by promptly
reporting back some of the [preliminary] results of the research.

COMMENT: Progress reports to interested persons and [...] of the
research, their interest tends to lag. Moreover, whenever anyone
feels left out and uninformed, he tends to [develop fearful and]
suspicious attitudes. Consequently, whenever [...] to keep all those
persons who have a relationship [to the] research project informed
of developments and [...] research are fostered. These fearful [and]
suspicious attitudes, in the absence [of information, give rise to]
inaccurate rumors about [...].

[...] their preliminary and tentative [...] [re]minded that more
detailed analysis might well change the interpretation of the
initial results. An [example used was to] cite the experience of a
company [in which] employees who were union members had a [less
favorable] attitude toward the company than nonunion [employees].
This relationship, however, was changed as soon as the character of
the work of the employees was held constant. Blue-collar workers had
[less favorable attitudes toward] the company than white-collar
workers [and union] membership was [...]. [When the results were
examined separately for] blue-collar workers and for white-collar
workers, [the results] were changed. Union members then were found
to have [more] favorable attitudes toward the company than
nonmembers.
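
The reversal in the union-membership example, where a comparison changes direction once the kind of work is held constant, can be illustrated with a small cross-tabulation. What follows is only a minimal sketch in Python. The counts are invented for illustration and are not data from the company described, and the direction of the initial difference (union members less favorable overall) is itself an assumption. The function names are likewise hypothetical.

```python
# Hypothetical counts, invented for illustration only (not data from the study).
# Each entry maps (kind of work, union status) to
# (number of employees, number with a favorable attitude toward the company).
counts = {
    ("blue-collar", "union"): (800, 400),
    ("blue-collar", "nonunion"): (200, 80),
    ("white-collar", "union"): (100, 80),
    ("white-collar", "nonunion"): (900, 630),
}


def favorable_rate(rows):
    """Return the share of favorable attitudes over (n, favorable) pairs."""
    total = sum(n for n, _ in rows)
    favorable = sum(f for _, f in rows)
    return favorable / total


def overall_rates(table):
    """Favorable rate by union status, ignoring the kind of work."""
    rates = {}
    for status in ("union", "nonunion"):
        rows = [v for (_, s), v in table.items() if s == status]
        rates[status] = favorable_rate(rows)
    return rates


def rates_by_work(table):
    """Favorable rate by union status with the kind of work held constant."""
    rates = {}
    for work in ("blue-collar", "white-collar"):
        rates[work] = {
            status: favorable_rate([table[(work, status)]])
            for status in ("union", "nonunion")
        }
    return rates


if __name__ == "__main__":
    # Straight run: union members look less favorable than nonmembers.
    print("Overall:", overall_rates(counts))
    # Cross-tabulation: within each kind of work the comparison reverses.
    for work, by_status in rates_by_work(counts).items():
        print(work, by_status)
```

In these invented figures most union members are blue-collar workers, and blue-collar workers as a group are less favorable, so the straight run mixes the effect of the kind of work with the effect of membership. Holding the kind of work constant separates the two.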

Occasionally, the [first] reaction of a company officer to
measurements which [showed that a] particular portion of the company
was performing [less effi]ciently than the rest of the company was
to feel that there should be a change in the supervisory personnel
involved. [...] the reaction [of] top management in [...] [to use
the] measurement to help
supervisors and managers to develop the skills called for in their
jobs. There was continuous emphasis that the purpose of the research
was not to perform a policing function but rather to find
what principles and methods are associated with better performance
and how to help mediocre units to improve. Throughout the study,
the focus was on discovering the methods which produce successful
operation and learning how to train people to use these methods
effectively. It was important that supervisory personnel not fear the
research as a policing operation likely to hurt them but view it
rather as a major effort to help each of them to learn how to perform
his job more successfully.

The results of the study in Company A fell into two distinct
categories. The first kind of results dealt with the measurements of
company operations, either for the company as a whole or for
subunits of the company. The other kind of data was based on
cross-tabulations which, for example, showed the different levels of
employee productivity associated with different methods of
supervision. This second kind of analysis, in addition to its value to
Company A, adds to the body of fundamental knowledge about the
principles of organizational functioning and leadership. Each of
these two kinds of results was reported to the company personnel.

In Company A, the results initially presented to company
officers were of the first kind. They were "straight runs" showing
results for the company as a whole and for some of the major
operating departments of the company. These and similar data were
presented first to the President and the Executive Vice-president in
a meeting which was organized by the Director of Industrial
Relations. The presentation of the results was done by the member of
the research organization who was in charge of the study.

The initial presentation did not include a detailed report presenting
[a full analysis] of the data. On the contrary, only [a few of the
results were] presented and discussed. The discussion [and analysis]
were participated in by all of those present at the meeting.
Questions were asked by the President as to [why attitudes] toward
first-line supervision differed so substantially in [...]. Tentative
hypotheses were offered by him [and others] as to possible causes.
Tabulations [subsequently] completed by the research staff proved
some of these hypotheses untenable and corroborated others.
Additional hypotheses
that were suggested became guides for further analyses of the
data, and these further analyses were examined in subsequent
meetings with these and other company officers.

[The] President [...] primarily to [...] those [...] company policy.
[Several] changes occurred [as a result of the analysis] of the data
and discussions [...]. One change, for example, involved an
improvement in some of the benefits provided by the [company to its]
employees. Another change resulted in [a modification of the way in]
which job evaluation was performed, a [change which] produced
significant changes in the operation of the [...].

At the end of the initial [meeting, the] President and Executive
Vice-president suggested [that there be a] series of similar
meetings with [groups of vice-pre]sidents present. Each group was to
include [vice-presidents with related] activities. At these meetings
the data presented included not only the results for the company as
a whole [and for the] departments whose vice-presidents were present
but also results for [the] major divisions within each of these
departments. There [was again full] examination and discussion of
the findings by [the] company officers. A major area of discussion
[in the meetings of] the President and groups of vice-presidents had
to [do with] plans as to how each of the departments and their
various subunits [could] best go about analyzing and applying the
results [...].

It was decided to have each vice-president present to his division
heads the results for his department, [each] division chief to
present to his [section heads the] results for the division and for
the sections, and so on [until the] results were discussed with
rank-and-file employees. [...] groups considering [...] difficult
were the [...] objective impartiality of [...]
group to approach these problems in a constructive, problem-solving
orientation. Employees' attitudes and feelings came to be facts
instead of things to be disregarded because they appeared to be too
difficult to handle or did not clamor for immediate attention.


One significant change was introduced when the results were
reported by each vice-president to his immediate subordinates. In the
meetings with the President and the Executive Vice-president, the
results had been presented by the member of the research
organization who directed the study. In the meetings conducted by
each vice-president for his staff, the responsibility for presenting
the results was placed upon the vice-president. It was necessary for
him to become sufficiently familiar with the methodology used in the
study so that he could describe the methods used and answer any
questions about the study which might arise during the presentation
of the results. For example, when some of the measurements indicated
that a specific operation was not being done as well as might be
expected, the people who were responsible for these operations often
were likely to question the accuracy of the measurements. In these
circumstances, it was necessary for the vice-president to be
sufficiently well informed about the methods used and the magnitude
and scope of possible errors so that he could answer these questions
as to the probable accuracy of the results. It was also necessary for
him to point to other measurements and results which supported the
specific result being questioned.

Each vice-president was provided with tabulations showing the
results for the company as a whole and for his department and for
each division within his department. Results were available also for
other related or comparable departments so that each vice-president
could show his group how they compared with other comparable
[groups in the] company. Data were also made available by the
research organization showing measurements for comparable
operations in other companies outside Company A. All these results
[enabled the] vice-president to discuss with his staff the
pattern of results for his department.

[At each] of these meetings conducted by a vice-president, a member
[of the staff of the] Director of Industrial Relations was present
to help answer any questions that might arise. In addition, a member
of the research organization was also present to answer any specific
questions about the research which the vice-president might not be
prepared to answer, and also to explain what kinds of additional
tabulations could be made. The groups were encouraged to ask for
additional tabulations which would help them see what problems
existed, where in their departments the problem occurred, what
caused the problem, and what changes might be most likely to
improve the situation. Most groups asked for these additional
tabulations.

After these sessions with the vice-president, each division head
was asked to conduct similar meetings with his subordinates, and they
in turn asked their section heads to conduct meetings with their
subordinates. In this manner the results were reported to lower and
lower echelons in the company until the results had been reported
to all the employees of the company.

COMMENT: The series of meetings started at the top of
the line organization and worked down. It was found that in most
departments in which the people at the top had a genuine interest
in the findings, studied them, and tried to apply them, the data were
discussed more adequately and used more constructively in working
out action steps than where such interest was lacking.

A high degree of personal involvement in the analysis and
interpretation was obtained through [...] supervisor who was
engaged in any managerial or supervisory [...] participate in two

kinds of meetings. First, there were one or more meetings in which
he participated as a [subordinate with his] associates and under the
leadership of his chief. [Second,] there were one or more meetings
[in which he was] chief of his group and conducted [the discussion
of] the data [...]. [This] compelled him to [become familiar with]
the over-all results so that he could answer questions which arose
in [...].

Some of the [vice-presidents] asked division heads and subordinates
to report [...] [interpre]tations arrived at [in their] discussions
[...]. [In the] departments [in] which this was done, more [...]
interpretations [of the] research [...] greater action was [...]
departments in which [...] to discuss results [...].

The second major use [of the data in] Company A was to analyze the
data to discover the pattern of relationships that existed between
such variables as organizational structure and managerial and
supervisory practices on the one hand and employee productivity and
job satisfaction on the other.

COMMENT: Figures 2 and 3 show some of the relationships
that were found and illustrate one of the methods of analysis used.
Such findings as these are not only of value to Company A but
they, and generalizations based on them and on similar results,
make an important contribution to our fundamental knowledge
about organizational functioning and leadership (15, 16, 17, 18, 24).

The pattern of relationships obtained from this analysis was
used by Company A in three major ways:

(1) These relationships were studied by the top officers of the
company in staff meetings along with related data from similar
studies. The purpose of this analysis by top officers of the company
was to examine present company policies in the light of these
relationships and to consider modifying those policies where the
results suggested that a modification would bring improvement. All the
different kinds of policies affecting employee behavior and reaction
were studied, including policies dealing with such matters as
wages, promotion, job evaluation, company benefits, training, and
supervisory practices.


(2) The training department built much of its material for
training supervisors in human relations skills upon results obtained
from the research. They used the research results for case material
[...] to some of the more common human relations [problems] in the
company. In the training sessions, supervisors [discussed what]
changes in supervisory practices and procedures would [be most
likely to] bring about improvement, especially in relation to
[those] problems which, the data showed, occurred most frequently.
[They also] discussed the patterns of relationships that [were found
between different kinds] of supervisory behavior [and the attitudes
or] morale associated with high or with low productivity [...].

(3) [The] patterns of relationship which were discovered, along
with related material from other studies, were made available to
[all levels of] management as they examined and studied the specific
[measure]ments dealing with their own operation. The purpose of
making these results available when the measurements were being
studied was to help guide the decisions as to which course of action
among the alternatives present would be most likely to result in an
improvement.

[Figure caption residue: "Employee-centered" supervisors [...] than
"production-centered" supervisors. Chart not reproduced.]

[...] of management, after careful study and discussion of the
research results, to introduce modifications in their operation which
the personnel involved felt would be likely to bring about an
improvement. After these changes had been in operation for a
sufficient period of time, top management authorized middle and
lower levels of management to arrange with the research organization
for whatever remeasurement was required in order to discover the
effectiveness of the changes made.

This remeasurement, which was done in several different parts
of the company, demonstrated that some of the changes introduced
produced significant improvement and that other changes had a
negligible effect. Some situations required two or three cycles
of measurement, analysis, attempted change, and remeasurement
before improvement began to occur.

Case B (Research on a Population Being Served by an Agency) 

Case B represents another type of situation. It differs from Case 
A in that the research is focused on persons outside the organization. 
In both cases the results were used to bring about improvement in 
the operation of the organization. In Case A, the changes involved 
internal relationships; in Case B, the changes were program changes 
to help the organization achieve its objectives more effectively. In 
both types of cases, those persons who are in positions of influence 
need to participate in planning the research and in decisions on 
applying the research. Unless this is the case, the research results 
are not likely to be applied. 

In the summer and fall of 1941, the U. S. Department of Agri- 
culture asked the Agricultural War Boards in the Great Lakes 
dairy states to undertake campaigns to increase the production of 
milk. This increase was necessary to meet the increasing demand in 
the United States as a result of increased purchasing power on the 
part of consumers and also to meet our commitments to England
to supply substantial amounts of evaporated milk, cheese, and dried
skim milk.

The officers in the Department of Agriculture who were responsible
for this increase in milk production were eager to give the
Agricultural War Boards in the Great Lakes states every assistance
possible in order that their campaign to increase milk production
would meet with success. As a part of the assistance being given
to the Agricultural War Boards, they asked the Division of Program
Surveys in the Bureau of Agricultural Economics whether it would
be willing to help the X State Agricultural War Board by conducting
a study to help guide its campaign to increase milk production.
After the Division had indicated its willingness to make the study, 
the Department of Agriculture officer responsible for the milk pro- 
gram asked the Chairman of the Agricultural War Board in X State 
whether he would like this assistance. The Chairman was interested
in obtaining all the help he could on the important and difficult task
he faced and was glad to have the study undertaken.

COMMENT: This situation illustrates a difficult problem often
encountered in the conducting of social-science research for an
organization or agency. In this case, an arm of the federal government
asked the research staff to make a study to help an agency of
a state government. Even though this state agency had close functional
ties with the federal agency involved, it was nevertheless proud
of its autonomy and determined to maintain it.

In such circumstances, if the research staff had appeared to be 
responsible solely to the federal agency and interested only in the 
problem as seen by the federal agency, animosities and frictions 
would probably have developed which could have created a situation 
making the research difficult to conduct or causing the results to be 
ignored. To achieve the cooperation, support, and involvement
needed, it was necessary for the research staff to approach the
agency with a genuine and sincere interest in learning what the
officers of the agency felt their problems to be and what they would
like to see studied.

This same kind of problem, of [...] central staff of an
organization [...] of the operating units. It even [...]
department. In [...] company asks [...] to establish a cooperative
[...]

[...] not sure that research could help it or in what way. The members
felt that, by and large, they knew the situation in X State well and
that their campaign plans were sound.

The X State Agricultural War Board believed the following
to be the facts:

(1) The State Agricultural Statistician's data for X State showed
that the total number of cows in X State was larger than at any
previous time in the state's history. Consequently, the War Board
was of the opinion that virtually all the barns in X State were full
and hence no further increase in number of cows was practicable
at this time without an increase in barn space. The members also
knew that their own barns were full and that the barns of all the
farmers they knew were full.

(2) They knew that the price of milk in relation to the price
of feed made it highly profitable to feed milk cows heavily, including
grains and protein concentrates, in order to increase the milk
production per cow. They were following this practice themselves,
and all the farmers they knew were following this practice. They
assumed that this was true among farmers generally in X State.

Believing these to be the facts for X State, the War Board
was of the opinion that the way to increase dairy production was
(1) to assure an adequate supply of feed grains and protein
concentrates at a reasonable price, (2) to facilitate the building of
additional barn space, and (3) to increase the available farm labor for
dairy operation. Consequently, the War Board proposed to ask that
feed be made available at a specified price and that priorities be
granted or allocations made so that farmers could obtain all the
lumber, concrete, steel, plumbing, and other material required for
building additional barn space. The members also felt it desirable
that steps be taken to increase the farm labor supply in X State.

With this background, the study was undertaken. The major
objectives were to find the extent to which farmers were producing
the maximum amount of milk and the steps which could be taken
to make possible a further increase in dairy production. These
objectives included discovering what resources, if any, farmers felt they
needed to increase their dairy production. Such resources were
additional barn space, milking machines, other equipment, more feed,
higher quality feed, more labor, etc. The study also had as an
objective discovering the extent to which dairy farmers in X State were
motivated to attempt to produce a maximum amount of milk and
what the influences were that were motivating them to increase dairy
production and what the motivational forces were that were acting
in the opposite direction. This included seeking to discover how
farmers felt about milk prices and the extent to which farmers knew
that a price guarantee had been made on milk and dairy products
through June 30, 1943. It was also desired to know the extent to
which farmers knew that there was an urgent demand for an
increase in milk production and the reasons for this increase in
demand.

The study was designed with a sufficiently large sample so that 
the results could be analyzed separately for each of three major 
milk-producing areas. These areas were the counties producing milk 
primarily for (1) cheese, (2) evaporated milk, and (3) dried skim milk. 
In each area a cross section of farmers and a sample of township 
AAA committeemen were interviewed. Interviews were conducted in 
a total of nine counties, three counties in each of the three areas.
In each of these counties the three county AAA committeemen were
also interviewed.

During the time of the interviewing, the study director made 
an effort to drop in periodically for short visits with the Chairman of 
the State Agricultural War Board. He used these visits to report on 
the progress in the interviewing and to quote some of the answers 
being obtained in the interviews. He knew that many of the answers
were proving to be quite different from what the War Board
expected, and he wanted to prepare them for the results.

The results from the interviews with the farmers and
the AAA committeemen proved to be quite different from what the
State Agricultural War Board expected. Their expectations were
based on the statistics and reports available to them and what they
learned from talking with their friends and acquaintances. But,
unfortunately, they lacked important information. They did not
have available any data reflecting the information, attitudes, and
behavior of the rank-and-file of farmers in X State.

The results, based on a cross-section of farmers, to quote the
original report, were:

1. There is ample barn space to accommodate considerable
expansion in herds. . . . Thus, to 95 percent of the farmers
contacted, the lack of barn space is not a determining factor in
deciding whether or not to expand. . . .

2. Labor supply in the state appears to be adequate, and
the fear of a labor shortage does not seem to be an important
obstacle to an expansion in production. . . .

3. Equipment needs appear to be no handicap to increased
production. . . .

4. An overwhelming proportion of X State farmers are
satisfied with the present price of milk. . . .

5. Running through all the data is the very noticeable thread
that the farmer is uncertain, that he lacks confidence in the
continuation of good prices, that he is apprehensive of a collapse
of prices and markets after the war. . . .

It appears that memories of the catastrophe that struck
farmers after the last war and the experience of the recent
hard, lean depression years have sensitized farmers to the danger
of debt. . . .

The great importance of guarantees of security and price
over a long period of time is apparent. . . .

6. A third major kind of brake on expansion is a distinct
lack of information. . . . First, farmers are confused and only
partially informed about the actions of the government in regard
to production and prices. Second, they lack knowledge about the
possibility of increased output through better feeding
practices. . . .


7. Expansion of production in dairying can, of course, come
in one or both of two ways: larger herds and better feeding. The
first involves physical expansion, which may be hampered by the
fears and uncertainties indicated above. The latter, better
feeding, can be done despite uncertainties. However, better feeding
is not the means by which the average farmer will make large
increases. . . .

Ninety-four percent of those expanding production are
enlarging herds. The overwhelming proportion of those interviewed
felt that they were already at, or close to, the economic
limit on feeding intensity.


The interviews with township AAA committeemen
showed a significantly different pattern from that obtained from
farmers. The following results quoted from the original report
present some of the major differences that were found:


1. Committeemen seem to be responding much faster than
other farmers to the present favorable conditions by planning
to feed their cows more heavily this year. . . .

2. There appears to be some evidence that committeemen
anticipated the need for increased production of milk and are
taking advantage of it by increasing their herds fully a year
in advance of other farmers. . . .

3. Strongly affecting the foregoing is the difference in
information between the two groups. Three-quarters of the
committeemen signified that they had heard of the government
guarantees as opposed to only one-fifth of the farmers.


The interviews in this study were made between September
20 and October 1, 1941, and preliminary results were presented to
the Agricultural War Board on October 2. The War Board was quite
surprised by the results because the findings were almost the opposite
of what it had believed to be the situation. After it had the results
available, it began to check them by talking to county AAA
committeemen and various other groups of farmers. After a few days
of these discussions, the board became increasingly convinced that
the results presented to it in the preliminary findings were
substantially correct.

The pattern that the board found was the one indicated by the
study: the well-informed farmers, like themselves, knew of the
increase in demand for milk, the reasons for this increase, and the
probable stability of the increased demand. These farmers, like
members of the State War Board, had already increased cow numbers
to the point where their barns were full, and they were now
increasing production further by heavier feeding of [...]
concentrates. County AAA committeemen [...] of behavior as the [...]
Township committeemen were more like the rank-and-file of farmers
than [...] but they [...] of infor[...] and in the extent to which
they were already increasing milk production.

COMMENT: The research staff deliberately made much use of
the pattern of results that was found. It pointed out that the
township committeemen's attitudes and behavior were more like the
results which the War Board had expected to find than were the
attitudes and behavior of farmers generally. This helped to show
that informed farmers were behaving as the War Board believed all
farmers were behaving. The War Board members also recognized
that their own friends and contacts tended, like themselves, to be
much better informed than farmers generally. This made it easy
to suggest that probably, as the results indicated, many farmers were
not so well informed as the War Board had previously assumed.

After several days of studying the findings and testing them by
talking with farm leaders and groups of farmers, the Agricultural
War Board became convinced of the accuracy of the findings. It then
drastically revised its plans for the campaign to increase milk
production in X State. The revised plans called, first, for a major
informational campaign through the mass media, especially radio, in which
there would be much emphasis upon


(1) the great increase in demand for milk and milk products,
the reasons for this increase, and the reasons why this increase
in demand would not collapse suddenly;

(2) the price support and guarantees to which the
government was committed through June 30, 1943;

(3) the importance of increasing dairy production primarily
by the heavier feeding of grains and concentrates.


[...] campaign through the mass media, the [...] November. At
these school district [...] committeemen were to explain and discuss
with the farm[...] of [...] milk, the character of [...] guarantees,
and the best methods [...] this increase in production through
heavier [...]

[...] campaign was to have each farmer called upon [...]
committeeman. The purpose of this visit was [...] farmer [...]
production [...] and to answer their questions about the need for
this increase. The visit was to be combined with the usual
Agricultural Conservation Program visit in which farmers were asked
annually to sign up for approved soil-conservation practices. This
visit for the combined purposes was to be held earlier in the crop
year than the annual conservation-practice visit.

It was recognized by the State War Board that the township
committeemen would have to be provided with full facts about the
need for increased milk, the government-program price guarantees
with regard to milk, and the advantages of increasing production by
heavier feeding. Material covering these points was prepared by the
War Board and supplied to all county agricultural officials. Plans
were also drawn for the visits that the township committeemen
were to make to each farmer in order to explain the need for
increased dairy production.

In order to test the adequacy of this plan, the materials
prepared, and the training given township committeemen, three
experimental counties were selected. During the period of October [...]
to October 18, interviewers of the Division of Program Surveys
accompanied township committeemen while they made their calls on
[...]

[...] were observed in the manner in which the township
committeemen conducted the visits with farmers. One [...] about the
soil-conservation p[...] much less about the milk campaign [...]
urgent requirement at the [...] information and in the manner in
which [...] were reported by the Division of Program Surveys [...]
Agricultural War Board. The War Board instituted local [...]
programs to overcome these deficiencies before the [...]
committeemen called upon farmers throughout the milk-producing
areas of X State during the early part of November to encourage
[...] increase milk production.

COMMENT: [...] illustration of how the research-action-research
[...] can be operated. After the initial study, plans for [...]
results were developed and the adequacy of [...] pretested in an
experiment or pilot [...] were then appraised through measurement,
and the plan was modified and improved before more general
application. When the problem is a continuing one, further
measurements should be obtained after the general application.

The X State War Board found the assistance given to it in this
study of so much value that the Agricultural War Boards of the other
Great Lakes dairy states, when they learned of the study, asked
for similar assistance in their own states. Time and financial
limitations made it impossible to conduct separate studies for the War
Boards in each of the states, but representatives of the Division of
Program Surveys met with these War Boards and described the
results of the X State study, the major conclusions which came from
it, and the specific applications that were made. Each of these State
Agricultural War Boards then made corresponding applications in
its own state.

An indication of the value of this project can be obtained by
examining the increase in dairy production that occurred in X
State during the ensuing twelve months. The increase in milk
production for the year November 1, 1941 to October 31, 1942 over
the preceding twelve-month period was 6.7 percent. This was one of
the highest rates of increase that occurred throughout the entire prewar
and war period. Moreover, the study led the State War Board not
to request the allocation of steel, lumber, cement, plumbing material,
etc., which had originally been contemplated. This conserved scarce
material for urgent needs in the war effort.


Case C (A Story of What Not to Do) 

Case C, a condensed synthetic case, focuses on undesirable
practices; reference to desirable practices has been largely omitted.
There continues to be evidence that, [...] government, and other
agencies, attempts are made to use social-science research in
situations in which the re[...] operating personnel. Such [...]
prevent participation and involvement by operating per[...]
process and prevent the research staff from developing a full
understanding of operating problems.

This organization was essentially a service research unit for
operating governmental agencies. It was headed by an administrator
who was competent but who happened to have no scientific training
or research experience. This deficiency made it impossible for him
to judge the competence of the research personnel on his staff and
to know to whom and to what extent he could delegate tasks and
functions. It handicapped him seriously also in his attempts to
judge whether a proposed research project or program was the best
possible design or would even adequately meet the requirements
of the operating agency requesting the research.

This administrator was an earnest man who took his
responsibility seriously and recognized that research results might
exercise a major influence in decisions on important matters. He also
recognized that some of the methods of the social scientists [...]
developed. This caused him to feel that they were [...] hence
likely, at times, to yield erroneous results. As a consequence,
he established procedures involving checks and [...]

Unfortunately, in order for his system to work, [...] research
personnel from the operating personnel [...] imposed a condition
on the research [...] virtually impossible for it to perform the
[...] expected of it.

An agency desiring to have [...] problem it currently faced could
request to have research done, but only through a liaison person
designated to [...] function by the administrative head of the
research [...] A research unit or section was then given the
research [...] through channels. The research staff then designed a
[...] to meet the problem as it was described to them through
channels. Usually there was little opportunity to discuss the
proposed research with the agency [...] restricted contact [...] the
research personnel and [...] organization's [...] research
methodolog[...] personnel how some of [...] through research.

COMMENT: The personnel of operating agencies usually have
no way of knowing the [...] social-science research can
contribute to the solution of the problems they face. They
obtain this information primarily by discussing their problems
sufficiently with social scientists for the scientists to become familiar
with the problems and to indicate what kinds of assistance research
might be expected to contribute. Consequently, in situations such
as this, in which the contacts between the operating personnel and
the research staff are restricted to a minimum, there is little
opportunity for the scientists to become familiar with operating problems
and to suggest how research might be used to help solve them. When
the personnel of operating agencies are unable to obtain ideas and
suggestions from the scientists, they necessarily limit their requests
for research only to the very few possibilities which their restricted
knowledge of research and its potentiality suggests.

Under such conditions, it is virtually impossible for the
operating personnel to grasp the full magnitude of what research can do
for them and to seek this assistance. Similarly, the research
personnel, without knowledge of the operating problems, do not recognize
the need that operating personnel have for research and so are
unable to suggest possible research projects. In these circumstances,
research is done only on minor aspects of problems or on superficial
[...]

[...] the original analysis. Frequently, however, the original data could
be retabulated or analyzed further so as to yield significant facts
to help answer the new questions. But the liaison person was unable
to suggest that this could be done because he was unfamiliar with
the methodology and the data.

COMMENT: If research for an operating agency is done well,
it will yield generalizations applicable to many problems and policies
and not just specific answers to specific questions. But if these
generalizations are to be used fully and correctly, it is essential that
they be understood by the operating personnel. An effective
understanding of the generalizations usually is best acquired by discussion
of their implications for specific operating problems. This requires
discussions of such problems by key operating persons with scientists
who either have conducted the research or know it well. When
administrative regulations prevent such discussions, the use of the
research results is seriously limited.

There is another kind of problem which occurs when a liaison
person without research training is placed between the research
and the operating personnel. Often the most intelligent application
of the research findings involves recognizing the presence of patterns
of results and not merely considering individual items. Moreover,
these pattern interpretations may require knowledge not only of the
findings from this current study but knowledge of the results of
other research, including work done previously in this agency and
work done elsewhere and published. The scientists who conduct the
research have this background knowledge of other research results.
The liaison person who lacks scientific training almost never has this
knowledge and consequently is less likely to be able to help the
[...] for patterns of results that have relevance